1. INTRODUCTION


Stimulus sampling theory is concerned with providing a mathematical 
language in which may be expressed assumptions about learning and 
performance in relation to stimulus variables. A special advantage 
of the formulations to be discussed is that their mathematical properties 
permit application of the simple and elegant theory of Markov chains
(Feller, 1957; Kemeny, Snell, and Thompson, 1957; Kemeny and Snell, 1959) 
to the tasks of deriving theorems and generating statistical tests of the 
agreement between assumptions and data. This branch of learning theory 
has developed in close interaction with certain types of experimental 
analysis; consequently it will be both natural and convenient to organize 
this presentation around the theoretical treatments of a few standard 
reference experiments. 

At the level of experimental interpretation, most contemporary
learning theories utilize a common conceptualization of the learning
situation in terms of stimulus, response, and reinforcement. The stimulus
term of this triumvirate refers to the environmental situation with respect
to which behavior is being observed, the response term to the class of
observable behaviors whose measurable properties change in some orderly
fashion during learning, and the reinforcement term to the experimental
operations or events believed to be critical in producing learning. Thus,
in a simple paired-associate experiment concerned with the learning of
English equivalents to Russian words, the stimulus might consist in
presentation of the printed Russian word alone, the response measure in
the relative frequency with which the learner is able to supply the English
equivalent from memory, and reinforcement in paired presentation of the
stimulus and response words.

In other chapters of this volume, and in the general literature on 
learning theory, the reader will encounter the notions of sets of responses
and sets of reinforcing events. In the present chapter, mathematical sets
will be used to represent certain aspects of the stimulus situation. It
should be emphasized from the outset, however, that the mathematical models
to be considered are somewhat abstract and that the empirical interpretations
of stimulus sets and their elements are not to be considered fixed
and immutable. Two main types of interpretation will be discussed: in
one of these the empirical correspondent of a stimulus element is the full
pattern of stimulation effective on a given trial, in the other the
correspondent of an element is a component, or aspect, of the full pattern
of stimulation. In the former case, we speak of "pattern models" and in
the latter of "component models" (Estes, 1959b).

There are a number of ways in which characteristics of the stimulus 
situation are known to affect learning and transfer. Rates and limits of
conditioning and learning generally depend upon both stimulus magnitude,
or intensity, and upon stimulus variability from trial to trial. Retention
and transfer of learning depend upon the similarity, or communality, between


the stimulus situations obtaining during training and during the test for 





A. and E. 3+ 


retention or transfer. These aspects of the stimulus situation can be 
given direct and natural representations in terms of mathematical sets 
and relations between sets. 

The basic notion common to all stimulus sampling theories is the
conceptualization of the totality of stimulus conditions that may be
effective during the course of an experiment in terms of a mathematical
set. Although it is not a necessary restriction, it is convenient for
mathematical reasons to deal only with finite sets, and this limitation
will be assumed throughout our presentation. Stimulus variability is
taken into account by assuming that of the total population of stimuli
available in an experimental situation, generally only a part actually
affects the subject on any one trial. Translating this idea into the
terms of a stimulus sampling model, one may represent the total population
by a set of "stimulus elements" and the stimulation effective on
any one trial by a sample from this set. Many of the simple mathematical
properties of the models to be discussed arise from the assumption that
these trial samples are drawn randomly from the population, with all
samples of a given size having equal probabilities. Although it is
sometimes convenient and suggestive to speak in such terms, one should
not assume that the stimulus elements are to be identified with any
simple neurophysiological unit, as, for example, receptor cells. At
the present stage of theory construction, we mean to assume only that
certain properties of the set-theoretical model represent certain
properties of the process of stimulation. If these assumptions prove
to be sufficiently well substantiated when the model is tested against
behavioral data, then it will be in order to look for neurophysiological
variables which might underlie the correspondences.





Just as the ratio of sample size to population size is a natural
way of representing stimulus variability, sample size per se may
be taken as a correspondent of stimulus intensity, and the amount
of overlap (i.e., proportion of common elements) between two stimulus
sets may be taken to represent the degree of communality between
two stimulus situations.





Our concern in this chapter is not to survey the rapidly
developing area of stimulus sampling theory, but simply to present
some of the fundamental mathematical techniques and illustrate their
applications. For general background, the reader is referred to
Bush (1960), Bush and Estes (1959), Estes (1959a, 1962), and Suppes
and Atkinson (1960). We shall consider first, and in some detail,
the very simplest of all learning models, the pattern model for
simple learning. In this model, the population of available
stimulation is assumed to comprise a set of distinct stimulus
patterns, exactly one of which is sampled on each trial. In the
important special case of the one-element model, it is assumed that
there is only one such pattern and that it recurs intact at the
beginning of each experimental trial. Granting that the one-element
model represents a radical idealization of even the most simplified
conditioning situations, we shall find that it is worthy of study
not only for expositional purposes but also for its value as an
analytic device in relation to certain types of learning data.







After a relatively thorough treatment of pattern models for simple
acquisition and for learning under probabilistic reinforcement schedules,
we shall take up more briefly the conceptualization of generalization
and transfer; the component models in which the patterns of stimulation
effective on individual trials are treated, not as distinct elements, but
as overlapping samples from a common population; and, finally, some
examples of the more complex multiple-process models which are becoming
increasingly important in the analysis of discrimination learning, concept
formation, and related phenomena.


2. ONE-ELEMENT MODELS 











We begin by considering some one-element models which are special
cases of the more general theory. These examples are especially simple
mathematically and provide us with the opportunity to develop some
mathematical tools which will be necessary in later discussions.
Application of these models is appropriate if the stimulus situation
is sufficiently stable from trial to trial that it may be theoretically
represented to a good approximation by a single stimulus element which
is sampled with probability 1 on each trial. At the start of a trial
the element is in one of several possible conditioning states; it may or
may not remain in this conditioning state, depending on the reinforcing
event for that trial. In the first part of this section we consider a
model for paired-associate learning which has been intensively analyzed
by Bower (1961, 1962). In the second part of this section
we consider a one-element model for a two-choice learning situation
involving a probabilistic reinforcement schedule. The model generates
some predictions which are undoubtedly incorrect, except possibly under
ideal experimental conditions; nevertheless it provides a useful introduction
to more general cases which we pursue in Section 2.


2.1 Learning of a Single Stimulus-Response Association

Imagine the simplest possible learning situation. A single stimulus
pattern, S, is to be presented on each of a series of trials and each
trial is to terminate with reinforcement of some designated response, the
"correct response" in this situation. According to stimulus sampling
theory, learning occurs in an all-or-none fashion with respect to S.
This means that:

1. If the correct response is not originally conditioned to
("connected to") S, then, until learning occurs, the probability of the
correct response is zero.

2. There is a fixed probability c that the reinforced response
will become conditioned to S on any trial.

3. Once conditioned to S, the correct response occurs with
probability one on every subsequent trial.

These assumptions constitute the simplest case of the "one-element pattern
model." Learning situations which completely meet the specifications
laid down above are as unlikely to be realized in psychological experiments
as perfect vacuums or frictionless planes in the physics laboratory.
However, reasonable approximations to these conditions can be attained.













The requirement that the same stimulus pattern be reproduced on each 
trial is probably fairly well met in the standard paired-associate 
experiment with human subjects. In one such experiment, conducted in 
the laboratory of one of the writers (W. K. E.), the stimulus member of 
each item was a trigram and the correct response an English word, e.g., 

          S          R

         xvk       house
On a reinforced trial the stimulus and response members were exposed 
together, as shown. Then, after several such items had received a 
single reinforcement, each of the stimuli was presented alone, the 
subject being instructed to give the correct response from memory, if 
he could. Then each item was given a second reinforcement, followed by 
a second test, and so on. 

According to the assumptions of the one-element pattern model, a
subject should be expected to make an incorrect response on each test
with a given stimulus until learning occurs, then a correct response on
every subsequent trial; if we represent an error by a 1 and a correct
response by a 0, the protocol for an individual item over a series of
trials should, then, consist in a sequence of 0's preceded, in most
cases, by a sequence of 1's. Actual protocols for several subjects are
shown below:
[Table of response protocols for nine items; the individual 0 and 1
entries are not legible in this copy.]

The first seven of these correspond perfectly to the idealized theoretical
picture; the last two deviate slightly. The proportion of "fits" and
"misfits" in this sample is about the same as in the full set of 80 cases
from which the sample was taken. The occasional lapses, i.e., errors
following correct responses, may be symptomatic of a forgetting process
which should be incorporated into the theory or they may be simply the
result of minor uncontrolled variables in the experimental situation
which are best ignored for theoretical purposes. Without judging this
issue, we may conclude that the simple one-element model at least merits
further study.

Before we can make quantitative predictions we need to know the
value of the conditioning parameter c. Statistical learning theory
includes no formal axioms specifying precisely what variables determine
the value of c, but on the basis of considerable experience we can
safely assume that this parameter will vary with characteristics of
the populations of subjects and items represented in a particular
experiment. An estimate of
the value of c for the experiment under consideration is easy to come by.
In the full set of 80 cases (40 subjects, each tested on two items), the
proportion of correct responses on the test given after a single
reinforcement was .39. According to the model, the probability is c
that a reinforced response will become conditioned to its paired stimulus;
consequently, c is the expected proportion of successful conditionings
out of 80 cases, and therefore the expected proportion of correct responses
on the subsequent test. Thus we may simply take the observed proportion,
.39, as an estimate of c.

In order to test the model, we need now to derive theoretical
expressions for other aspects of the data. Suppose we consider the
sequences of correct and incorrect responses, 000, 001, etc., on the
first three trials. According to the model, a correct response should
never be followed by an error, so the probability of the sequence 000 is
simply c, and the probabilities of 001, 010, 011, and 101 are all zero.
To obtain an error on the first trial followed by a correct response on
the second, conditioning must fail on the first reinforcement but occur
on the second, and this joint event has probability $(1-c)c$. Similarly,
the probability that the first correct response occurs on the third trial
is given by $(1-c)^2 c$ and the probability of no correct response in
three trials by $(1-c)^3$. Substituting the estimate .39 for c in each
of these expressions, we obtain the predicted values which are compared
with the corresponding empirical values for this experiment in Table 1.





Table 1

Observed and predicted (one-element model) values for response sequences
over first three trials of a paired-associate experiment.

               Observed       Theoretical
Sequence*      Proportions    Proportions
  000            .36            .39
  001            .02            0
  010            .01            0
  011            0              0
  100            .27            .24
  101            0              0
  110            [illegible]    .14
  111            .25            .23

* 0 = correct response
  1 = error
The correspondences are seen to be about as close as could be expected
with proportions based on 80 response sequences.
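
The three-trial predictions follow mechanically from the single parameter c, and a short computation makes the pattern explicit. The Python sketch below is an illustration added here, not part of the original analysis; the function name `sequence_probabilities` is hypothetical, and the value c = .39 is the estimate taken from the text. It assumes, as in Section 2.1, that no correct responses arise by guessing.

```python
from itertools import product


def sequence_probabilities(c, n_trials=3):
    """Probability of each response sequence under the one-element model.

    0 = correct response, 1 = error. No guessing is assumed, so an item
    is wrong on every test until it is conditioned and right afterwards.
    """
    probs = {}
    for seq in product("01", repeat=n_trials):
        s = "".join(seq)
        k = s.count("1")  # number of errors in the sequence
        if s != "1" * k + "0" * (n_trials - k):
            # An error following a correct response cannot occur.
            probs[s] = 0.0
        elif k == n_trials:
            # Conditioning failed on every reinforcement.
            probs[s] = (1 - c) ** n_trials
        else:
            # k failures to condition, then one success.
            probs[s] = (1 - c) ** k * c
    return probs


predicted = sequence_probabilities(0.39)
for seq in sorted(predicted):
    print(seq, round(predicted[seq], 2))
```

The eight probabilities sum to one, which gives a quick check that no sequence has been overlooked.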


2.2 Paired-Associate Learning 

In order to apply the one-element model to paired-associate
experiments involving fixed lists of items, it is necessary to adjust
the "boundary conditions" appropriately. Consider, for example, an
experiment reported by Estes, Hopkins, and Crothers (1960). The task
assigned their subjects was to learn associations between the numbers
1 through 8, serving as responses, and eight consonant trigrams, serving
as stimuli. Each subject was given two practice trials and two test
trials. On the first practice trial, the eight syllable-number pairs
were exhibited singly in a random order. Then a test was given, the
syllables alone being presented singly in a new random order and the
subjects attempting to respond to each syllable with the correct number.
Then four of the syllable-number pairs were presented on a second
practice trial and all eight syllables were included in a final test trial.

In writing an expression for the probability of a correct response
on the first test in this experiment, we must take account of the fact
that after the first practice trial, the subjects knew that the responses
were the numbers 1 through 8, and were in a position to guess at the correct
answers when shown syllables that they had not yet learned. The minimum
probability of achieving a correct response to an unlearned item by
guessing would be 1/8. Thus we would have for $p_0$, the probability of
a correct response on the first test,

$$p_0 = c + (1-c)/8 ,$$

i.e., the probability c that the correct association was formed plus the
probability (1-c)/8 that the association was not formed but the correct
response was achieved by guessing. Setting this expression equal to the
observed proportion of correct responses on the first trial for the twice
reinforced items, we readily obtain an estimate of c for these
experimental conditions,

$$.404 = c + (1-c)(.125)$$

and

$$c = .32 .$$


Now we can proceed to derive expressions for the joint probabilities of
various combinations of correct and incorrect responses on the first
and second tests for the twice reinforced items. For the probability
of correct responses to a given item in both tests, we have

$$p_{00} = c + (1-c)(.125)c + (1-c)^2(.125)^2 .$$

With probability c, conditioning occurs on the first reinforced trial,
and then correct responses necessarily occur on both tests; with probability
$(1-c)c(.125)$, conditioning does not occur on the first reinforced trial
but does on the second and a correct response is achieved by guessing on
the first test; with probability $(1-c)^2(.125)^2$, conditioning occurs on
neither reinforced trial but correct responses are achieved by guessing
on both tests. Similarly, we obtain
$$p_{01} = (1-c)^2(.875)(.125) ,$$

$$p_{10} = (1-c)(.875)[c + (1-c)(.125)] ,$$

and

$$p_{11} = (1-c)^2(.875)^2 .$$


Substituting for c in these expressions the estimate computed above, we
arrive at the predicted values which are compared with the corresponding
observed values below.

               Observed       Predicted
$p_{00}$         .35            .35
$p_{01}$         .05            .05
$p_{10}$         [illegible]    .24


Although this comparison reveals some disparities which we might hope 
to reduce with a more elaborate theory, it is surprising, to the writers 
at least, that the patterns of observed response proportions in both 
experiments considered can be predicted as well as they are by such 
an extremely simple model. 
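
The estimate of c and the joint probabilities for the two tests can be reproduced with a few lines of arithmetic. The following Python sketch is offered only as an illustration; the function names `estimate_c` and `joint_probabilities` are hypothetical, and the inputs (.404 for the observed proportion, 1/8 for the guessing probability) are the values given in the text.

```python
def estimate_c(p_first_correct, guess=1 / 8):
    """Solve p0 = c + (1 - c) * guess for c."""
    return (p_first_correct - guess) / (1 - guess)


def joint_probabilities(c, guess=1 / 8):
    """Joint probabilities of correct (0) and error (1) on the two tests."""
    miss = 1 - guess  # probability of an incorrect guess
    return {
        # Conditioned on trial 1; or lucky guess then conditioned; or two lucky guesses.
        "00": c + (1 - c) * guess * c + (1 - c) ** 2 * guess ** 2,
        # Never conditioned: lucky guess on test 1, wrong guess on test 2.
        "01": (1 - c) ** 2 * guess * miss,
        # Wrong guess on test 1, then conditioned or lucky guess on test 2.
        "10": (1 - c) * miss * (c + (1 - c) * guess),
        # Never conditioned and two wrong guesses.
        "11": (1 - c) ** 2 * miss ** 2,
    }


c_hat = estimate_c(0.404)
pred = joint_probabilities(round(c_hat, 2))  # the text rounds c to .32
for pair in sorted(pred):
    print(pair, round(pred[pair], 2))
```

Because the four events are exhaustive and mutually exclusive, the four predicted values must sum to one for any c between 0 and 1.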

Ordinarily, experiments concerned with paired-associate learning
are not limited to a couple of trials, like those just considered, but
continue until the subjects meet some criterion of learning. Under these
circumstances it is impractical to derive theoretical expressions for all
possible sequences of correct and incorrect responses. A reasonable goal
is, instead, to derive expressions for various statistics which can be
conveniently computed for the data of the standard experiment; examples of
such statistics are the mean and variance of errors per item, frequencies
of runs of errors or correct responses, or serial correlation of errors
over trials with any given lag. Bower (1961) is responsible for the first
major analysis of this type. We shall use some of his results to illustrate
application of the one-element model to a full "learning-to-criterion"
experiment.

As a reference experiment for this application, we shall use one of
Bower's experiments (1961). Essential details of the experiment are as
follows: A list of ten items was learned by 29 undergraduates to a
criterion of two consecutive errorless trials. The stimuli were different
pairs of consonant letters and the responses were the integers 1 and 2;
each response was assigned as correct to a randomly selected five items for
each subject. A response was obtained from the subject on each presentation
of an item and he was informed of the correct answer following his
response.

As in the preceding application, we shall assume that each item in
the list is to be represented theoretically by exactly one stimulus element
which is sampled with probability 1 when the item is presented, and that
the correct response to that item is conditioned in an all-or-none fashion.
On trial n of the experiment an element is in one of two "conditioning
states": In state $C$ the element is conditioned to the correct response;
in state $\bar{C}$ the element is not conditioned.

The response the subject makes depends on his conditioning state. When
the element is in state $C$, the correct response occurs with probability 1.
The probability of the correct response when the element is in state $\bar{C}$
depends on the experimental procedure. In Bower's experiment the subjects
were told the r responses available to them and each occurred equally
often as the to-be-learned response. Therefore, we may assume that in the
unconditioned state the probability of a correct response is 1/r, where
r is the number of alternative responses.

The conditioning assumptions can readily be restated in terms of the
conditioning states:

1. On any reinforced trial, if the sampled element is in state $\bar{C}$,
it has probability c of going into state $C$.

2. The parameter c is fixed in value in a given experiment.

3. Transitions from state $C$ to state $\bar{C}$ have probability zero.

We shall now derive some predictions from the model and compare these 
with observed data. The data of particular interest will be a subject's 
sequence of correct and incorrect responses to a specific stimulus item 
over trials. Similarly, in deriving results from the model we shall only 
consider an isolated stimulus item and its related sequence of responses. 
However, when we apply the model to data we assume that all items in the 
list are comparable, i.e., all items have the same conditioning parameter
c and all items start out in the same conditioning state ($\bar{C}$).
Consequently the response sequence associated with any given item is 
viewed as a sample of size 1 from a population of sequences all generated 
by the same underlying process. 
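
The generating process just described is easy to simulate, which gives a concrete sense of what a "population of sequences" looks like. The sketch below is an illustration only, written in Python; the function name `simulate_item`, the seed, and the number of simulated items are assumptions introduced here, while c = .344 and r = 2 are the values used for Bower's experiment in this section. Each simulated item starts in the unconditioned state, and the recorded sequence is cut off at the trial on which conditioning occurs, since only correct responses can follow.

```python
import random


def simulate_item(c, r, rng, max_trials=1000):
    """Return the response sequence (1 = error, 0 = correct) for one item.

    On each trial an unconditioned item is answered correctly by guessing
    with probability 1/r; the reinforcement that follows conditions the
    item with probability c. The sequence stops at the trial on which
    conditioning occurs (all later responses would be correct).
    """
    responses = []
    for _ in range(max_trials):
        responses.append(0 if rng.random() < 1 / r else 1)
        if rng.random() < c:
            break
    return responses


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [simulate_item(0.344, 2, rng) for _ in range(20000)]
mean_errors = sum(sum(run) for run in runs) / len(runs)
print(round(mean_errors, 2))
```

Summary statistics computed from such simulated protocols, for example the mean number of errors per item, can then be set beside the closed-form predictions of the model.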

A feature of this model which makes it especially tractable for
purposes of deriving various statistics is the fact that the sequence
of transitions between states $C$ and $\bar{C}$ constitutes a Markov chain. This
means that, given the state on any one trial, we can specify the probability
of each state on the next trial without regard to the previous
history. If we represent by $C_n$ and $\bar{C}_n$ the events that an item is in
the conditioned or unconditioned state, respectively, on trial n, and by
$q_{11}$ and $q_{21}$ the probabilities of transitions from state $C$ to state $C$
and from $\bar{C}$ to $C$, respectively, the conditioning assumptions lead
directly to the relations [footnote 2]

$$q_{11} = \Pr(C_{n+1} \mid C_n) = 1 ,$$

$$q_{21} = \Pr(C_{n+1} \mid \bar{C}_n) = c ,$$

and

$$Q = \begin{bmatrix} 1 & 0 \\ c & 1-c \end{bmatrix} ,$$

where Q is the matrix of one-step transition probabilities, the first
row and column referring to $C$ and the second row and column to $\bar{C}$. Now
the matrix of probabilities for transitions between any two states in n
trials is simply the nth power of Q, as may be verified by mathematical
induction (see, e.g., Kemeny, Snell, and Thompson, 1957, p. 327),

$$Q^n = \begin{bmatrix} 1 & 0 \\ 1-(1-c)^n & (1-c)^n \end{bmatrix} .$$

[Footnote 2] See Feller (1957) for a discussion of conditional probabilities. In
brief, if $H_1, \ldots, H_n$ are a set of mutually exclusive events of which
one necessarily occurs, then any event A can occur only in conjunction
with some $H_i$. Since the $AH_i$ are mutually exclusive, their probabilities
add. Applying the well-known theorem on compound probabilities, we
obtain

$$\Pr(A) = \sum_i \Pr(AH_i) = \sum_i \Pr(A \mid H_i)\Pr(H_i) .$$


Henceforth we shall assume that all stimulus elements are in state $\bar{C}$ at
the onset of the first trial of our experiment. Given that the state is
$\bar{C}$ on trial 1, the probability of being in state $\bar{C}$ at the start of trial
n is $(1-c)^{n-1}$, which goes to 0 as n becomes large, for c > 0.
Thus, with probability 1 the subject is eventually to be found in the
conditioned state.
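
The closed form for the n-step transition matrix can be checked numerically by multiplying Q by itself. The Python sketch below is an illustration added here; the helper names `mat_mul` and `mat_pow` are hypothetical, the rows and columns are ordered with the conditioned state first and the unconditioned state second as in the text, and c = .344 is the estimate used for Bower's experiment.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_pow(q, n):
    """n-th power of a square matrix by repeated multiplication."""
    size = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, q)
    return result


c = 0.344
Q = [[1.0, 0.0],      # from the conditioned state: stay conditioned
     [c, 1.0 - c]]    # from the unconditioned state: condition with probability c
Q5 = mat_pow(Q, 5)

# The lower-right entry should equal (1 - c)**5, the probability of
# remaining unconditioned through five reinforcements.
print(Q5[1][1], (1 - c) ** 5)
```

The lower-left entry is the complement, 1 - (1-c)^n, and approaches 1 as n grows, in agreement with the statement that the subject is eventually found in the conditioned state.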

Next we prove some theorems about the observable sequence of
correct and incorrect responses in terms of the underlying sequence of
unobservable conditioning states. We define the response random variable

$$A_n = \begin{cases} 0 & \text{if a correct response occurred on trial } n \\ 1 & \text{if an error occurred on trial } n . \end{cases}$$

By our assumed response rule, the probabilities of an error given that
the subject is in the conditioned or unconditioned state, respectively,
are

$$\Pr(A_n = 1 \mid C_n) = 0$$

and

$$\Pr(A_n = 1 \mid \bar{C}_n) = 1 - \frac{1}{r} .$$

To obtain the probability of an error on trial n, namely
$\Pr(A_n = 1)$, we sum these conditional probabilities weighted by the
probabilities of being in the respective states:

$$\Pr(A_n = 1) = \Pr(A_n = 1 \mid C_n)\Pr(C_n) + \Pr(A_n = 1 \mid \bar{C}_n)\Pr(\bar{C}_n) = \left(1 - \frac{1}{r}\right)(1-c)^{n-1} . \qquad (1)$$


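Consider next the infinite sum of the random variables $A_1, A_2, A_3, \ldots$,
which we denote $A$; specifically,

$$A = \sum_{n=1}^{\infty} A_n .$$

But

$$E(A) = E\left(\sum_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} E(A_n) = \sum_{n=1}^{\infty} \left(1 - \frac{1}{r}\right)(1-c)^{n-1} = \frac{1 - 1/r}{c} . \qquad (2)$$

Thus the number of errors expected during the learning of any given item
is given by Eq. 2.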
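
Equations 1 and 2 can be checked against each other by summing the trial-by-trial error probabilities directly. The Python sketch below is an illustration only; the function names are hypothetical and the values c = .25 and r = 2 are arbitrary choices made for the example, not estimates from any experiment. The last function simply inverts Eq. 2 to recover c from a mean error count.

```python
def error_probability(n, c, r):
    """Eq. 1: probability of an error on trial n."""
    return (1 - 1 / r) * (1 - c) ** (n - 1)


def expected_total_errors(c, r):
    """Eq. 2: expected number of errors per item over all trials."""
    return (1 - 1 / r) / c


def estimate_c_from_mean_errors(mean_errors, r):
    """Invert Eq. 2: the value of c that yields the given mean error count."""
    return (1 - 1 / r) / mean_errors


c, r = 0.25, 2  # arbitrary illustrative values
partial_sum = sum(error_probability(n, c, r) for n in range(1, 500))
print(partial_sum, expected_total_errors(c, r))
```

The truncated sum agrees with the closed form to within rounding error, because the neglected tail of the geometric series is vanishingly small after a few hundred trials.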
Equation 2 provides an easy method for estimating c. For any
given subject we can obtain his average number of errors over stimulus
items, equate this number to the right-hand side of Eq. 2 with r = 2,
and solve for c. We thereby obtain an estimate of c for each subject,
and inter-subject differences in learning are reflected in the variability
of these estimates. Bower, in analyzing his data, chose to assume that c
was the same for all subjects; thus he set E(A) equal to the observed
number of errors averaged over both list items and subjects and obtained
a single estimate of c. This group estimate of c simplifies the
computations involved in generating predictions. However, it has the
disadvantage that a discrepancy between observed and predicted values
may arise as a consequence of assuming equal c's when, in fact, the
theory is correct but c varies from subject to subject. Fortunately,
Bower has obtained excellent agreement between theory and observation
using the group estimate of c and, for the particular conditions he
investigated, any increase in precision that might be achieved by
individual estimates of c does not seem crucial.

For the experiment described above, Bower reports 1.45 errors per
stimulus item averaged over all subjects. Equating E(A) in Eq. 2
to 1.45, with r = 2, we obtain the estimate c = .344. All predictions
that we derive from the model for this experiment will be based on this
single estimate of c. It should be remarked that the estimate of c
in terms of Eq. 2 represents only one of many methods that could have
been used. Which method one selects depends on the properties of the
particular estimator (e.g., whether the estimator is unbiased and efficient
relative to other estimators). Parameter estimation is a theory in
its own right, and we shall not be able to discuss the many problems
involved in the estimation of learning parameters. The reader is referred
to Suppes and Atkinson (1960), and Estes and Suppes (1962), for discussions
of various methods and their properties. Associated with this
topic is the problem of assessing the statistical agreement between data
and theory, once parameters have been estimated; that is, the goodness-of-fit
between predicted and observed values. In our analysis of data in
this chapter we shall offer no statistical evaluation of the predictions
but shall simply display the results for the reader's inspection. Our
reason is that we present
the data only to illustrate features of the theory and its application;
these results are not intended to provide a test of the model. However,
in rigorous analyses of such models the problem of goodness-of-fit is
extremely important and needs careful consideration. Here again the
reader is referred to Suppes and Atkinson (1960) for a discussion of
some of the problems and possible statistical tests.

By using Eq. 1 with the estimate of c obtained above we have
generated the predicted learning curve presented in Fig. 1. The fit is
sufficiently close that most of the predicted and observed points cannot
be distinguished on the scale of the graph.

[Fig. 1. The average probability of an error on trial n in Bower's
paired-associate experiment. Ordinate: Pr (Error); abscissa: trials.]

As a basis for the derivation of other statistics of total errors,
we require an expression for the probability distribution of $A$. To
obtain this, we note first that the probability of no errors at all
occurring during learning is given by

$$c(1/r) + (1-c)(1/r)^2 c + \cdots = \frac{c}{r}\sum_{i=0}^{\infty}\left[\frac{1-c}{r}\right]^i = \frac{c}{r[1-(1-c)/r]} = \frac{b}{r} ,$$

where $b = \dfrac{c}{1-(1-c)/r}$. This event may arise if a correct response occurs
by guessing on the first trial and conditioning occurs on the first
reinforcement, if a correct response occurs by guessing on the first two
trials and conditioning occurs on the second reinforcement, and so on.
Similarly, the probability of no additional errors following an error on
any given trial is given by

$$c + c(1-c)/r + \cdots = c\sum_{i=0}^{\infty}\left[\frac{1-c}{r}\right]^i = b .$$

To have exactly k errors, we must have a first error (if k > 0),
which has probability 1 - b/r, k - 1 additional errors, each of which
has probability 1 - b, and then no more errors. Therefore the required
probability distribution is

$$\Pr(A = 0) = b/r ,$$
$$\Pr(A = k) = b(1-b/r)(1-b)^{k-1} , \quad \text{for } k \geq 1 . \qquad (3)$$
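
Two properties of Eq. 3 are worth confirming numerically: the probabilities should sum to one, and the mean of the distribution should reproduce Eq. 2. The Python sketch below is an illustration added here; the function names `b_param` and `total_error_pmf` are hypothetical, and c = .344 with r = 2 are the values used in the text for Bower's experiment. The infinite sums are truncated at 200 terms, which is an approximation chosen for convenience.

```python
def b_param(c, r):
    """b = c / (1 - (1 - c)/r): probability of no further errors after an error."""
    return c / (1 - (1 - c) / r)


def total_error_pmf(k, c, r):
    """Eq. 3: probability of exactly k errors on an item."""
    b = b_param(c, r)
    if k == 0:
        return b / r
    return b * (1 - b / r) * (1 - b) ** (k - 1)


c, r = 0.344, 2
pmf = [total_error_pmf(k, c, r) for k in range(200)]  # truncated support
total = sum(pmf)
mean = sum(k * p for k, p in enumerate(pmf))
print(round(total, 6), round(mean, 4), round((1 - 1 / r) / c, 4))
```

Agreement between the numerically computed mean and (1 - 1/r)/c shows that Eq. 3 and Eq. 2 describe the same process.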


Equation 3 can be applied to data directly to predict the form of the
frequency distribution of total errors. It may also be utilized in
deriving, e.g., the variance of this distribution. Preliminary to
computing the variance, we need the expectation of $A^2$:

$$E(A^2) = \sum_{k=0}^{\infty} k^2\, b(1-b/r)(1-b)^{k-1}$$

$$= b(1-b/r)\sum_{k=0}^{\infty} [k(k-1)+k](1-b)^{k-1}$$

$$= (1-b)\,b(1-b/r)\sum_{k=0}^{\infty} [k(k-1)+k](1-b)^{k-2} ,$$

where the second step is taken in order to facilitate the summation.
Using now the familiar expression,

$$\sum_{k=0}^{\infty} (1-b)^k = \frac{1}{b} ,$$

for the sum of a geometric series together with the relations

$$\frac{d}{db}(1-b)^k = -k(1-b)^{k-1} ,$$

$$\frac{d^2}{db^2}(1-b)^k = k(k-1)(1-b)^{k-2} ,$$

and

$$\sum_{k=0}^{\infty} \frac{d}{db}(1-b)^k = \frac{d}{db}\sum_{k=0}^{\infty}(1-b)^k = \frac{d}{db}\left(\frac{1}{b}\right) = -\frac{1}{b^2} ,$$

$$\sum_{k=0}^{\infty} \frac{d^2}{db^2}(1-b)^k = \frac{d^2}{db^2}\sum_{k=0}^{\infty}(1-b)^k = \frac{d^2}{db^2}\left(\frac{1}{b}\right) = \frac{2}{b^3} ,$$

we obtain

$$E(A^2) = b(1-b/r)\left[\frac{2(1-b)}{b^3} + \frac{1}{b^2}\right]$$

and

$$\mathrm{Var}(A) = E(A^2) - [E(A)]^2$$

$$= b(1-b/r)\left[\frac{2(1-b)}{b^3} + \frac{1}{b^2}\right] - \frac{(1 - 1/r)^2}{c^2}$$

$$= \left(1 - \frac{1}{r}\right)\frac{2c - cr + r - 1}{c^2 r}$$

$$= \frac{(r-1)(cr + 2c - 2cr + r - 1)}{r^2c^2} = \frac{r-1}{rc} + \frac{(r-1)^2(1-2c)}{r^2c^2}$$

$$= E(A)[1 + E(A)(1-2c)] . \qquad (4)$$

Inserting the estimates E(A) = 1.45 and c = .344 from Bower's data
in Eq. 4, we obtain 1.44 for the predicted standard deviation of total
errors, which may be compared with the observed value of 1.37.
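
The algebra leading to Eq. 4 is lengthy, so an independent numerical check is reassuring. The Python sketch below is an illustration only; the function names are hypothetical, and the parameter values c = .3 and r = 4 are arbitrary choices made for the example. It computes the variance once from the closed form of Eq. 4 and once by brute-force summation over the distribution of Eq. 3, truncated at 400 terms.

```python
def variance_closed_form(c, r):
    """Eq. 4: Var(A) = E(A) * [1 + E(A) * (1 - 2c)]."""
    mean = (1 - 1 / r) / c
    return mean * (1 + mean * (1 - 2 * c))


def variance_by_summation(c, r, n_terms=400):
    """Variance computed directly from the distribution of Eq. 3."""
    b = c / (1 - (1 - c) / r)
    # Terms for k >= 1; the k = 0 term contributes nothing to either moment.
    pmf = [b * (1 - b / r) * (1 - b) ** (k - 1) for k in range(1, n_terms)]
    first = sum(k * p for k, p in enumerate(pmf, start=1))
    second = sum(k * k * p for k, p in enumerate(pmf, start=1))
    return second - first ** 2


c, r = 0.3, 4  # arbitrary illustrative values
print(variance_closed_form(c, r), variance_by_summation(c, r))
```

The two computations agree to within rounding error, which supports the term-by-term simplification from the bracketed expression for $E(A^2)$ down to Eq. 4.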


Another useful statistic of the error sequence is $E(A_n A_{n+k})$,
namely, the expectation of the product of error random variables on
trials n and n+k. This quantity is related to the autocorrelation
between errors on trials n+k and trial n. By elementary probability
theory,

$$E(A_n A_{n+k}) = E(A_{n+k} \mid A_n = 1)\,E(A_n) = \Pr(A_{n+k} = 1 \mid A_n = 1)\Pr(A_n = 1) .$$

But for an error to occur on trial n+k it must be the case that
conditioning has failed to occur during the intervening k trials and
that the subject guessed incorrectly on trial n+k. Hence

$$\Pr(A_{n+k} = 1 \mid A_n = 1) = (1-c)^k\left(1 - \frac{1}{r}\right) .$$

Substituting this result into the preceding expression, along with the
result presented in Eq. 1, yields the following expression:

$$E(A_n A_{n+k}) = \left(1 - \frac{1}{r}\right)(1-c)^k(1-c)^{n-1}\left(1 - \frac{1}{r}\right) = \left(1 - \frac{1}{r}\right)^2(1-c)^{n+k-1} . \qquad (5)$$

A convenient statistic for comparison with data (directly related to the
average autocorrelation of errors with lag k, but easier to compute)
is obtained by summing the cross product of $A_n$ and $A_{n+k}$ over all
trials. We define $c_k$ as the mean of this random variable, where

$$c_k = E\left(\sum_{n=1}^{\infty} A_n A_{n+k}\right) = \sum_{n=1}^{\infty}\left(1 - \frac{1}{r}\right)^2(1-c)^{n+k-1} = E(A)\left(1 - \frac{1}{r}\right)(1-c)^k . \qquad (6)$$


To be explicit, consider the following response protocol running in time 


from left to right: 1101010010000. The observed values for c. are 


k 


e, = 1, ¢, = 2, c, = 2,.. and so on. 
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The predictions for c Coy and c¢ computed from the c..estimate 


Vv 3 
given above for Bower's experinient were .479,...310, and .201.. Bower's 
observed values were .486, .292, and .187. 
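
To make the statistic concrete, here is a small sketch of how the lag-$k$ cross products might be tallied from a response protocol, together with the prediction of Eq. 6. The function names are illustrative and not from the source; a protocol is assumed to be a string of 0s and 1s with 1 marking an error.

```python
# Sketch: observed lag-k cross products and the Eq. 6 prediction.

def lag_cross_products(protocol, k):
    """Sum of A_n * A_{n+k} over all trials of one 0/1 error protocol."""
    a = [int(ch) for ch in protocol]
    return sum(a[n] * a[n + k] for n in range(len(a) - k))

def expected_c_k(c, r, k):
    """Eq. 6: c_k = E(A)(1 - 1/r)(1 - c)^k, with E(A) = (1 - 1/r)/c."""
    mean_errors = (1.0 - 1.0 / r) / c
    return mean_errors * (1.0 - 1.0 / r) * (1.0 - c) ** k

protocol = "1101010010000"
observed = [lag_cross_products(protocol, k) for k in (1, 2, 3)]
print(observed)  # [1, 2, 2], matching the worked example in the text
```

In practice the observed value of $c_k$ would be the average of these sums over all subject-item protocols.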

Next we consider the distribution of the number of errors between
the kth and (k+1)st success. The methods to be used in deriving this
result are general and can be used to derive the distribution of errors
between the kth and (k+m)th success for any non-negative integer $m$.
The only limitation is that the expressions become unwieldy as $m$
increases. We shall define $J_k$ as the random variable for the number
of errors between the kth and (k+1)st success; its values are 0, 1, 2, ....
An error following the kth success can only occur if the kth success
itself occurs as a result of guessing; that is, the subject necessarily
is in state $\bar{C}$ when the kth success occurs. Letting $g_k$ denote the
probability that the kth success occurs by guessing, we can write the
probability distribution

$$\Pr(J_k = i) = \begin{cases} 1 - \alpha g_k, & \text{for } i = 0 \\ (1-\alpha)\,\alpha^i g_k, & \text{for } i > 0 \end{cases} \tag{7}$$

where $\alpha = (1-c)(1 - 1/r)$. To obtain $\Pr(J_k = 0)$ we note that 0 errors
can occur in one of three ways: (1) The kth success occurs because the
subject is in state $C$ (which has probability $1 - g_k$) and necessarily a
correct response occurs on the next trial; (2) the kth success occurs
by guessing, the subject remaining in state $\bar{C}$ and again guessing
correctly on the next trial [which has probability $g_k(1-c)(1/r)$]; or
(3) the kth success occurs by guessing but conditioning is effective
on the trial (which has probability $g_k c$). Thus
$\Pr(J_k = 0) = 1 - g_k + g_k(1-c)(1/r) + g_k c = 1 - \alpha g_k$.
The event of $i$ errors ($i > 0$) between the kth and (k+1)st successes can
occur in one of two ways: (1) The kth and (k+1)st successes occur by
guessing [with probability $g_k(1-c)^{i+1}(1 - 1/r)^i(1/r)$], or (2) the kth
success occurs by guessing and conditioning does not take place until
the trial immediately preceding the (k+1)st success [with probability
$g_k(1-c)^i(1 - 1/r)^i c$]. Hence

$$\Pr(J_k = i) = g_k(1-c)^{i+1}(1 - 1/r)^i \frac{1}{r} + g_k(1-c)^i(1 - 1/r)^i c$$

$$\qquad = g_k (1 - 1/r)^i (1-c)^i \Big[c + \frac{1}{r}(1-c)\Big] = g_k\,\alpha^i (1-\alpha).$$

From Eq. 7 we may obtain the mean and variance of $J_k$, namely

$$E(J_k) = \sum_{i=0}^{\infty} i \Pr(J_k = i) = \frac{\alpha g_k}{1-\alpha}, \tag{8}$$

and

$$\mathrm{Var}(J_k) = \sum_{i=0}^{\infty} i^2 \Pr(J_k = i) - [E(J_k)]^2 = \frac{\alpha g_k(1+\alpha)}{(1-\alpha)^2} - \frac{\alpha^2 g_k^2}{(1-\alpha)^2}$$

$$\qquad = \frac{\alpha g_k}{(1-\alpha)^2}[1 + \alpha(1 - g_k)]. \tag{9}$$


In order to evaluate the quantities above we require an expression
for $g_k$. Consider $g_1$, the probability that the first success occurs
by guessing. It could occur in one of the following ways: (1) The
subject guesses correctly on trial 1 (with probability $1/r$); or (2)
the subject guesses incorrectly on trial 1, conditioning does not occur,
and the subject guesses successfully on trial 2 [this joint event having
probability $(1 - 1/r)(1-c)(1/r)$]; or (3) conditioning does not occur on
trials 1 and 2, and the subject guesses incorrectly on both of these
trials but guesses correctly on trial 3 [with probability
$(1 - 1/r)^2(1-c)^2(1/r)$]; and so forth. Thus

$$g_1 = \frac{1}{r} + (1 - 1/r)(1-c)\frac{1}{r} + (1 - 1/r)^2(1-c)^2\frac{1}{r} + \cdots$$

$$\qquad = \frac{1}{r}\sum_{i=0}^{\infty} (1 - 1/r)^i(1-c)^i = \frac{1}{(1-\alpha)r}.$$

Now consider the probability that the kth success occurs by guessing for
$k > 1$. In order for this event to occur it must be the case that (1)
the (k-1)st success occurs by guessing, (2) conditioning fails to occur
on the trial of the (k-1)st success, and (3) since the subject is assumed
to be in state $\bar{C}$ on the trial following the (k-1)st success, the next
correct response occurs by guessing with probability $g_1$. Hence,

$$g_k = g_{k-1}(1-c)\,g_1.$$

Solving this difference equation³ we obtain

$$g_k = (1-c)^{k-1} g_1^{\,k}.$$

Finally, substituting the expression obtained above for $g_1$ yields

$$g_k = \frac{(1-c)^{k-1}}{(r - \alpha r)^k}. \tag{10}$$

3 The solution of this equation can quickly be obtained. Note that
$g_2 = g_1(1-c)g_1 = (1-c)g_1^2$. Similarly $g_3 = g_2(1-c)g_1$; substituting
the above result for $g_2$ we obtain $g_3 = (1-c)g_1^2(1-c)g_1 = (1-c)^2 g_1^3$.
If we continue in this fashion it will be obvious that $g_k = (1-c)^{k-1}g_1^{\,k}$.

We may now combine Eqs. 7 and 10, inserting our original estimate
of $c$, to obtain predictions about the number of errors between the
kth and (k+1)st success in Bower's data. To illustrate, for $k = 1$,
the predicted mean is .361 and the observed value is .350.
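
The quantities in Eqs. 7 through 10 are easy to evaluate numerically. The sketch below is illustrative only (the function names are not from the source) and takes $\alpha = (1-c)(1 - 1/r)$ as in the text. It also checks the closed-form mean and variance against direct summation of the distribution.

```python
# Sketch: errors between the kth and (k+1)st success, Eqs. 7-10.

def alpha(c, r):
    return (1 - c) * (1 - 1 / r)

def g(k, c, r):
    """Eq. 10: probability that the kth success occurs by guessing."""
    a = alpha(c, r)
    return (1 - c) ** (k - 1) / (r - a * r) ** k

def pr_errors_between(k, i, c, r):
    """Eq. 7: Pr(J_k = i)."""
    a, gk = alpha(c, r), g(k, c, r)
    return 1 - a * gk if i == 0 else (1 - a) * a ** i * gk

def mean_errors_between(k, c, r):
    """Eq. 8."""
    a = alpha(c, r)
    return a * g(k, c, r) / (1 - a)

def var_errors_between(k, c, r):
    """Eq. 9."""
    a, gk = alpha(c, r), g(k, c, r)
    return a * gk / (1 - a) ** 2 * (1 + a * (1 - gk))

# Example call; c and r here are illustrative inputs, with r = 2 assumed.
print(mean_errors_between(1, 0.344, 2), var_errors_between(1, 0.344, 2))
```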

To conclude our analysis of this model, we consider the probability
$p_k$ that a response sequence to a stimulus item will exhibit the property
of no errors following the kth success. This event can occur in one
of two ways: (1) The kth success occurs when the subject is in state
$C$ (the probability of which we have already calculated to be $1 - g_k$),
or (2) the kth success occurs when the subject is in state $\bar{C}$ and no
errors occur on subsequent trials. Let $b$ denote the probability of no
more errors following a correct guess. Then

$$p_k = (1 - g_k) + g_k b = 1 - g_k(1-b). \tag{11}$$

But the probability of no more errors following a successful guess is
simply

$$b = c + (1-c)\frac{1}{r}c + (1-c)^2\Big(\frac{1}{r}\Big)^2 c + \cdots = \frac{c}{1 - (1-c)/r} = \frac{c}{\alpha + c}.$$

Substituting this result for $b$ into Eq. 11, along with our
expression for $g_k$ in Eq. 10, we obtain

$$p_k = 1 - \frac{\alpha(1-c)^{k-1}}{(\alpha + c)(r - \alpha r)^k}. \tag{12}$$

Observed and predicted values of $p_k$ for Bower's experiment are shown
in Table 2.





Insert Table 2 about here 


We shall not pursue more consequences of this model.⁴ The particular
results we have examined were selected because they illustrated
fundamental features of the model and also introduced mathematical
techniques which will be needed later. In Bower's paper, more than 30
predictions of the type presented here are tested, with results comparable
to those exhibited above. The goodness-of-fit of theory to data in these
instances is quite representative of that which one may now expect to
obtain routinely in simple learning experiments when experimental
conditions have been appropriately arranged to approximate the simplifying
assumptions of the mathematical model.

4 Bower also has compared the one-element model with a comparable
single-operator linear model presented by Bush and Sternberg (1959).
The linear model assumes that the probability of an incorrect response
on trial $n$ is a fixed number $p_n$, where $p_{n+1} = (1-c)p_n$ and
$p_1 = (1 - 1/r)$. The one-element model and the linear model generate many
identical predictions (e.g., mean learning curve) and it is necessary to
look at the finer structure of the data to differentiate models. Of the
20 possible comparisons Bower makes between the two models, he finds
that the one-element model comes closer to the data on 18.
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Table 2 





Observed and predicted values for $p_k$, the probability of no errors
following the kth success. (Interpret $p_0$ as the probability of no
errors at all during the course of learning.)

     k      Observed $p_k$      Predicted $p_k$

     0          .255                .256
     1          .628                .636
     2          .812                .822
     3          .869                .912
     4          .928                .957
     5          .963                .979
     6          .973                .990
     7          .990                .995
     8          .990                .997
     9          .993                .998
    10          .996                .999
    11         1.000               1.000
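
As a rough illustration of how the predicted column might be generated, the sketch below evaluates Eq. 12 for $k = 0, 1, \ldots, 11$. The function name is illustrative, and the inputs $c = .344$ and $r = 2$ are assumptions based on the estimate quoted earlier; values computed this way may differ from the tabled predictions in the third decimal, depending on the precision of the $c$ estimate actually used.

```python
# Sketch: predicted p_k from Eq. 12 (assumed inputs c = 0.344, r = 2).

def p_no_more_errors(k, c, r):
    """Eq. 12: probability of no errors following the kth success."""
    a = (1 - c) * (1 - 1 / r)
    return 1 - a * (1 - c) ** (k - 1) / ((a + c) * (r - a * r) ** k)

predicted = [round(p_no_more_errors(k, 0.344, 2), 3) for k in range(12)]
for k, value in enumerate(predicted):
    print(k, value)
```

For $k = 0$ the expression reduces to $b/r$, the probability of no errors at all, which rounds to .256 under these inputs.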
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Concepts of the sort developed in this section can be extended to 
more traditional types of verbal learning situations involving stimulus 
similarity, meaningfulness, and the like. For example, Atkinson (1957) 
has presented a model for rote serial learning which is based on similar
ideas and deals with such variables as intertrial interval, list length,
and types of errors (perseverative, anticipatory, or response-failures).
Unfortunately, theoretical analyses of this sort for traditional
experimental routines often lead to extremely complicated mathematical
models with the result that only a few consequences of the axioms can be
derived. Stated differently, a set of concepts may be very general in
terms of the range of situations to which it is applicable; nevertheless,
in order to provide rigorous and detailed tests of these concepts, it is 
frequently necessary to contrive special experimental routines where the 


theoretical analyses generate tractable mathematical systems. 


2.5 Probabilistic Reinforcement Schedules 

We shall now examine a one-element model for some simple two-choice
learning problems. The one-element model for this situation, as
contrasted with the paired-associate model, generates some predictions
of behavior which are quite unrealistic and for this reason we defer an
analysis of experimental data until we consider comparable multi-element
processes. The reason for presenting the one-element model is that it
represents a convenient introduction to multi-element models and permits
us to develop some mathematical tools in a simple fashion. Further, when
we do discuss multi-element models we shall employ a rather restrictive
set of conditioning axioms. However, for the one-element model we may
present an extremely general set of conditioning assumptions without
getting into too much mathematical complexity. Therefore, the analysis
of the one-element case will suggest lines along which the multi-element
models can be generalized.

The reference experiment (see, e.g., Estes and Straughan, 1954;
Suppes and Atkinson, 1960) involves a long series of discrete trials.
Each trial is initiated by the onset of a signal. To the signal the
subject is required to make one of two responses which we denote $A_1$
and $A_2$. The trial is terminated with an $E_1$ or $E_2$ reinforcing event;
the occurrence of $E_i$ indicates that response $A_i$ was the correct
response for that trial. Thus in a human learning situation the subject
is required to predict on each trial which reinforcing event he expects
will occur by making the appropriate response: an $A_1$ if he expects $E_1$
and an $A_2$ if he expects $E_2$; at the end of the trial he is permitted to
observe which event actually occurred. Initially the subject may have no
preference between responses, but as information accrues to him over trials,
his pattern of choices undergoes systematic changes. The role of a model is
to predict the detailed features of these changes.

The experimenter may devise various schedules for determining the
sequence of reinforcing events over trials. For example, the probability
of an $E_1$ may be (1) some function of the trial number, (2) dependent
on previous responses of the subject, (3) dependent on the previous
sequence of reinforcing events, or (4) some combination of the above.
For simplicity, we consider a noncontingent reinforcement schedule. The
case is defined by the condition that the probability of $E_1$ is constant
over trials and independent of previous responses and reinforcements. It
is customary in the literature to call this probability $\pi$; thus,
$\Pr(E_{1,n}) = \pi$ for all $n$. Here we are denoting by $E_{i,n}$ the event
that reinforcement $E_i$ occurs on trial $n$. Similarly, we shall
represent by $A_{i,n}$ the event that response $A_i$ occurs on trial $n$.


We assume that the stimulus situation comprising the signal light
and the context in which it occurs can be represented theoretically by a
single stimulus element which is sampled with probability 1 when the
signal occurs. At the start of a trial, the element is in one of three
conditioning states: In state $C_1$ the element is conditioned to the $A_1$
response and in state $C_2$ to the $A_2$ response; in state $C_0$ the
element is not conditioned to either $A_1$ or $A_2$. The response rules
are similar to those presented earlier. When the subject is in $C_1$ or
$C_2$, the $A_1$ or $A_2$ response occurs with probability 1. In state $C_0$
we assume that either response will be elicited equiprobably; that is,
$\Pr(A_{1,n} \mid C_{0,n}) = \frac{1}{2}$. For some subjects a response bias may
exist which would require that we assume $\Pr(A_{1,n} \mid C_{0,n}) = \beta$ where
$\beta \ne \frac{1}{2}$. For these subjects it would be necessary to estimate
$\beta$ in applying the model. However, for simplicity we shall only pursue
the case where responses are equiprobable when the subject is in $C_0$.

We now present a general set of rules governing changes in
conditioning states. As the model is developed it will become obvious
that for some experimental problems restrictions can be imposed which
greatly simplify the process.


If the subject is in state $C_1$ and an $E_1$ occurs (i.e., the subject
makes an $A_1$ response which is correct), then he will remain in $C_1$.
However, if the subject is in $C_1$ and an $E_2$ occurs, then with
probability $c$ the subject goes to $C_2$ and with probability $c'$ to
$C_0$. Comparable rules apply when the subject is in $C_2$. Thus, if the
subject is in $C_1$ or $C_2$ and his response is correct, he will remain
in $C_1$ or $C_2$. If, however, he is in $C_1$ or $C_2$ and his response is
not correct, then he may shift to one of the other conditioning states,
thereby reducing the probability of repeating the same response on the
next trial.

Finally, if the subject is in $C_0$ and an $E_1$ or $E_2$ occurs,
then with probability $c''$ the subject moves to $C_1$ or $C_2$, respectively.⁵

5 Here we assume that the subject's response does not affect the change.
That is, if the subject is in $C_0$ and an $E_1$ occurs, then he moves to
$C_1$ with probability $c''$ independently of whether $A_1$ or $A_2$ occurred.
This assumption is not necessary and we could readily have the actual
response affect change. For example, we might postulate $c_1''$ for an
$A_1E_1$ or $A_2E_2$ combination, and $c_2''$ for the $A_1E_2$ or $A_2E_1$
combination; that is,
$\Pr(C_{1,n+1} \mid E_{1,n}A_{1,n}C_{0,n}) = \Pr(C_{2,n+1} \mid E_{2,n}A_{2,n}C_{0,n}) = c_1''$
and
$\Pr(C_{1,n+1} \mid E_{1,n}A_{2,n}C_{0,n}) = \Pr(C_{2,n+1} \mid E_{2,n}A_{1,n}C_{0,n}) = c_2''$,
where $c_1'' \ne c_2''$. However, such additions make the mathematical process
more complicated and should be introduced only when the data clearly
require them.


Thus, to summarize, for $i, j = 1, 2$ and $i \ne j$:

$$\Pr(C_{i,n+1} \mid E_{i,n}C_{i,n}) = 1$$
$$\Pr(C_{0,n+1} \mid E_{j,n}C_{i,n}) = c'$$
$$\Pr(C_{j,n+1} \mid E_{j,n}C_{i,n}) = c \tag{13}$$
$$\Pr(C_{i,n+1} \mid E_{i,n}C_{0,n}) = c''$$

where $0 < c'' \le 1$ and $0 < c + c' \le 1$.

We now use the assumptions of the preceding paragraphs and the
particular assumptions for the noncontingent case to derive the transition
matrix in the conditioning states. In making such a derivation it is
convenient to represent the various possible occurrences on a trial by a
tree. Each set of branches emanating from a point represents a mutually
exclusive and exhaustive set of possibilities. For example, suppose that
at the start of trial $n$ the subject is in state $C_1$; then the tree in
Fig. 2 represents the possible changes that can occur in the conditioning
state.


Insert Fig. 2 here 





The first set of branches is associated with the reinforcing event
on trial $n$. If the subject is in $C_1$ and an $E_1$ occurs, then he
will stay in state $C_1$ on the next trial. However, if an $E_2$ occurs,
then with probability $c$ he will go to $C_2$, with probability $c'$ he
will go to $C_0$, and with probability $1 - c - c'$ he will remain in $C_1$.

Fig. 2. Branching process, starting from state $C_1$ on trial $n$,
for one-element model in two-choice, noncontingent case.

Each path of a tree, from a beginning point to a terminal point,
represents a possible outcome on a given trial. The probability of each
path is obtained by multiplying the appropriate conditional probabilities.
Thus, for the tree in Fig. 2 the probability of the bottom path may be
represented by
$\Pr(E_{2,n} \mid C_{1,n})\Pr(C_{1,n+1} \mid E_{2,n}C_{1,n}) = (1-\pi)(1 - c - c')$.
Of the four paths, two lead from $C_1$ to $C_1$; hence

$$p_{11} = \Pr(C_{1,n+1} \mid C_{1,n}) = \pi + (1-\pi)(1 - c - c').$$

Similarly, $p_{10} = (1-\pi)c'$ and $p_{12} = (1-\pi)c$, where $p_{ij}$ denotes the
probability of a one-step transition from $C_i$ to $C_j$.
For the $C_0$ state we have the tree given in Fig. 3. On the top
branch an $E_1$ event is indicated and by Eq. 13 the probability of going
to $C_1$ is $c''$ and of staying in $C_0$ is $1 - c''$. A similar analysis
holds for the bottom branches. Thus we have

$$p_{01} = \pi c'', \qquad p_{02} = (1-\pi)c'', \qquad p_{00} = 1 - c''.$$

Insert Fig. 3 here

Combining these results and the comparable results for $C_2$ yields the
following transition matrix:

$$P = \begin{array}{c|ccc}
 & C_1 & C_0 & C_2 \\ \hline
C_1 & 1 - (1-\pi)(c' + c) & c'(1-\pi) & c(1-\pi) \\
C_0 & c''\pi & 1 - c'' & c''(1-\pi) \\
C_2 & c\pi & c'\pi & 1 - \pi(c' + c)
\end{array} \tag{14}$$
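
For readers who wish to experiment with the process, the matrix of Eq. 14 can be written down directly. The following is a minimal sketch in which the names `c1` and `c2` stand for $c'$ and $c''$; these identifiers, the helper `step`, and the parameter values are illustrative and not part of the original formulation.

```python
# Sketch: the transition matrix of Eq. 14, states ordered C1, C0, C2.

def transition_matrix(pi, c, c1, c2):
    """pi = Pr(E1); c, c1 (for c'), c2 (for c'') are the conditioning parameters."""
    return [
        [1 - (1 - pi) * (c1 + c), c1 * (1 - pi), c * (1 - pi)],   # from C1
        [c2 * pi, 1 - c2, c2 * (1 - pi)],                         # from C0
        [c * pi, c1 * pi, 1 - pi * (c1 + c)],                     # from C2
    ]

def step(dist, P):
    """One trial: multiply the row vector of state probabilities by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

P = transition_matrix(pi=0.7, c=0.2, c1=0.1, c2=0.3)
dist = [0.0, 1.0, 0.0]          # start in the unconditioned state C0
for _ in range(5):
    dist = step(dist, P)
print(dist)
```

Each row of the matrix sums to 1, as it must for a matrix of transition probabilities.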





Fig. 3. Branching process, starting from state $C_0$ on trial $n$,
for one-element model in two-choice, noncontingent case.
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As in the case of the paired-associate model, a large number of 
predictions can be derived easily for this process. However, we shall 
only select a few which are useful in clarifying the fundamental properties 
of the model. We begin by considering the asymptotic probability of a 
particular conditioning state and, in turn, the asymptotic probability of 
an $A_1$ response. The following notation will prove useful: Let
$[p_{ij}]$ be the transition matrix and define $p_{ij}^{(n)}$ as the probability of
being in state $j$ on trial $r+n$, given that at trial $r$ the subject
was in state $i$. The quantity is defined recursively:

$$p_{ij}^{(1)} = p_{ij}, \qquad p_{ij}^{(n+1)} = \sum_{v} p_{iv}^{(n)} p_{vj}.$$

Moreover, if the appropriate limit exists and is independent of $i$, we
set

$$u_j = \lim_{n \to \infty} p_{ij}^{(n)}.$$


The limiting quantities $u_j$ exist for any finite-state Markov chain
that is irreducible and aperiodic. A Markov chain is irreducible if there
is no closed proper subset of states; that is, no proper subset of states
such that once within this set the probability of leaving it is 0. For
example, the chain whose transition matrix is

$$\begin{array}{c|ccc}
 & 1 & 2 & 3 \\ \hline
1 & 3/4 & 1/4 & 0 \\
2 & 1/2 & 1/2 & 0 \\
3 & 1/3 & 1/3 & 1/3
\end{array}$$

is reducible because the set {1, 2} of states is a proper closed subset.
A Markov chain is aperiodic if there is no fixed period for return to
any state, and periodic if a return to some initial state $j$ is impossible
except at $t, 2t, 3t, \ldots$ trials for $t > 1$. Thus the chain whose
matrix is

$$\begin{array}{c|ccc}
 & 1 & 2 & 3 \\ \hline
1 & 0 & 1 & 0 \\
2 & 0 & 0 & 1 \\
3 & 1 & 0 & 0
\end{array}$$

has period $t = 3$ for return to each state.
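
Periodicity can be verified mechanically by raising the transition matrix to successive powers and noting the trials on which a return to a state has positive probability. The sketch below does this for the three-state cyclic chain just described; the helper names are illustrative.

```python
# Sketch: return times for the cyclic chain 1 -> 2 -> 3 -> 1.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def return_times(P, state, horizon):
    """Trials n <= horizon at which the n-step return probability is positive."""
    times, power = [], P
    for n in range(1, horizon + 1):
        if power[state][state] > 0:
            times.append(n)
        power = matmul(power, P)
    return times

cyclic = [[0, 1, 0],
          [0, 0, 1],
          [1, 0, 0]]
print([return_times(cyclic, s, 9) for s in range(3)])
# every state can be re-entered only on trials 3, 6, 9
```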

If there are $r$ states, we call the vector $u = [u_1, u_2, \ldots, u_r]$ the
stationary probability vector of the chain. It may be shown [Feller (1957),
Kemeny and Snell (1959)] that the components of this vector are the solutions
of the $r$ linear equations

$$u_j = \sum_{v=1}^{r} u_v p_{vj} \tag{15}$$

such that $\sum_{v=1}^{r} u_v = 1$. Thus to find the asymptotic probabilities $u_j$
of the states, we need find only the solution of the $r$ equations. The
intuitive basis of this system of equations seems clear. Consider a two-
state chain. Then the probability $p_{n+1}$ of being in state 1 on
trial $n+1$ is the probability of being in state 1 on trial $n$ and going
to 1 plus the probability of being in state 2 on trial $n$ and going to 1;
that is

$$p_{n+1} = p_n p_{11} + (1 - p_n)p_{21}.$$

But at asymptote $p_{n+1} = p_n = u_1$ and $1 - p_n = u_2$, whence

$$u_1 = u_1 p_{11} + u_2 p_{21},$$

which is the first of the two equations of the system when $r = 2$.

It is clear that the chain represented by the matrix $P$ of Eq. 14
is irreducible and aperiodic; thus the asymptotes exist and are
independent of the initial probability distribution on the states. Let
$[p_{ij}]$ $(i, j = 1, 2, 3)$ be any 3x3 transition matrix. Then we seek the
numbers $u_j$ such that $u_j = \sum_v u_v p_{vj}$ and $\sum_v u_v = 1$. The general
solution is given by $u_j = D_j/D$ where

$$D_1 = p_{31}(1 - p_{22}) + p_{21}p_{32}$$
$$D_2 = p_{31}p_{12} + p_{32}(1 - p_{11}) \tag{16}$$
$$D_3 = (1 - p_{11})(1 - p_{22}) - p_{21}p_{12}$$
$$D = D_1 + D_2 + D_3.$$

Inserting in these equations the equivalents of the $p_{ij}$ from the
transition matrix and renumbering the states appropriately we obtain

$$D_1 = \pi c''(c + c'\pi)$$
$$D_0 = \pi(1-\pi)c'(c' + 2c)$$
$$D_2 = (1-\pi)c''[c + c'(1-\pi)].$$

Since $D$ is the sum of the $D_j$'s and since $u_j = D_j/D$ we may divide
the numerator and denominator by $(c'')^2$ and obtain

$$u_1 = \frac{\pi[\rho + \epsilon\pi]}{\pi[\rho + \epsilon\pi] + \pi(1-\pi)\epsilon[\epsilon + 2\rho] + (1-\pi)[\rho + \epsilon(1-\pi)]}$$

$$u_0 = \frac{\pi(1-\pi)\epsilon[\epsilon + 2\rho]}{\pi[\rho + \epsilon\pi] + \pi(1-\pi)\epsilon[\epsilon + 2\rho] + (1-\pi)[\rho + \epsilon(1-\pi)]} \tag{17}$$

$$u_2 = 1 - u_1 - u_0,$$

where $\rho = c/c''$ and $\epsilon = c'/c''$.

By our response axioms we have

$$\Pr(A_{1,n}) = \Pr(C_{1,n}) + \tfrac{1}{2}\Pr(C_{0,n})$$

for all $n$. Hence

$$\lim_{n \to \infty} \Pr(A_{1,n}) = u_1 + \tfrac{1}{2}u_0
 = \frac{\pi[\rho + \epsilon\rho + \tfrac{1}{2}\epsilon^2] + \pi^2[\epsilon - \epsilon\rho - \tfrac{1}{2}\epsilon^2]}
        {\pi[\epsilon^2 + 2\epsilon\rho - 2\epsilon] + \pi^2[2\epsilon - \epsilon^2 - 2\epsilon\rho] + \rho + \epsilon}. \tag{18}$$

An inspection of Eq. 18 indicates that the asymptotic probability of
an $A_1$ response is a function of $\pi$, $\rho$, and $\epsilon$. As will become
clear later the value of $\Pr(A_{1,\infty})$ is bounded in the open interval
from $\frac{1}{2}$ to $\pi^2/[\pi^2 + (1-\pi)^2]$; whether $\Pr(A_{1,\infty})$ is above
or below $\pi$ depends on the values of $\rho$ and $\epsilon$.
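
A numerical cross-check of Eqs. 16 and 18 may be helpful. The sketch below builds the matrix of Eq. 14 for arbitrary illustrative parameter values, obtains the stationary vector from the determinant-style expressions of Eq. 16, confirms that it is left unchanged by one step of the chain, and compares $u_1 + \frac{1}{2}u_0$ with the closed form of Eq. 18. It assumes the reading $\rho = c/c''$ and $\epsilon = c'/c''$; function and variable names are illustrative.

```python
# Sketch: stationary probabilities via Eq. 16 and the asymptote of Eq. 18.

def stationary_three_state(P):
    """u_j = D_j / D for any 3x3 transition matrix (Eq. 16)."""
    d1 = P[2][0] * (1 - P[1][1]) + P[1][0] * P[2][1]
    d2 = P[2][0] * P[0][1] + P[2][1] * (1 - P[0][0])
    d3 = (1 - P[0][0]) * (1 - P[1][1]) - P[1][0] * P[0][1]
    d = d1 + d2 + d3
    return [d1 / d, d2 / d, d3 / d]

def asymptotic_a1(pi, rho, eps):
    """Eq. 18, assuming rho = c/c'' and eps = c'/c''."""
    num = (pi * (rho + eps * rho + eps ** 2 / 2)
           + pi ** 2 * (eps - eps * rho - eps ** 2 / 2))
    den = (pi * (eps ** 2 + 2 * eps * rho - 2 * eps)
           + pi ** 2 * (2 * eps - eps ** 2 - 2 * eps * rho) + rho + eps)
    return num / den

pi, c, c1, c2 = 0.7, 0.2, 0.1, 0.3      # c1 stands for c', c2 for c''
P = [[1 - (1 - pi) * (c1 + c), c1 * (1 - pi), c * (1 - pi)],   # C1
     [c2 * pi, 1 - c2, c2 * (1 - pi)],                         # C0
     [c * pi, c1 * pi, 1 - pi * (c1 + c)]]                     # C2
u1, u0, u2 = stationary_three_state(P)
print(u1 + u0 / 2, asymptotic_a1(pi, c / c2, c1 / c2))
```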
We now consider two special cases of our one-element model. The first
case is comparable to the multi-element models to be discussed later, whereas
the second case is, in some respects, the complement of the first case.

Case of c' = 0. Let us rewrite Eq. 14 with $c' = 0$. Then the
transition matrix will have the following canonical form:

$$P = \begin{array}{c|ccc}
 & C_1 & C_2 & C_0 \\ \hline
C_1 & 1 - c(1-\pi) & c(1-\pi) & 0 \\
C_2 & c\pi & 1 - c\pi & 0 \\
C_0 & c''\pi & c''(1-\pi) & 1 - c''
\end{array} \tag{19}$$

We note that once the subject has left state $C_0$ he can never return. In
fact it is obvious that $\Pr(C_{0,n}) = \Pr(C_{0,1})(1 - c'')^{n-1}$ where $\Pr(C_{0,1})$ is
the initial probability of being in $C_0$. Thus, except on early trials, $C_0$
is not part of the process and the subject in the long run fluctuates between
$C_1$ and $C_2$, being in $C_1$ on a proportion $\pi$ of the trials.

From Eq. 19 we have also

$$\Pr(C_{1,n+1}) = \Pr(C_{1,n})[1 - c(1-\pi)] + \Pr(C_{2,n})\,c\pi + \Pr(C_{0,n})\,c''\pi.$$

That is, the probability of being in $C_1$ on trial $n+1$ is equal to the
probability of being in $C_1$ on trial $n$ times the probability $p_{11}$ of
going from $C_1$ to $C_1$ plus the probability of being in $C_2$ times $p_{21}$
plus the probability of being in $C_0$ times $p_{01}$. For simplicity let
$x_n = \Pr(C_{1,n})$, $y_n = \Pr(C_{2,n})$ and $z_n = \Pr(C_{0,n})$. Now we know that
$z_n = z_1(1 - c'')^{n-1}$ and also that $x_n + y_n + z_n = 1$ or
$y_n = 1 - x_n - z_1(1 - c'')^{n-1}$. Making these substitutions in the
recursion above yields

$$x_{n+1} = x_n[1 - c(1-\pi)] + z_1 c''\pi(1 - c'')^{n-1} + c\pi[1 - x_n - z_1(1 - c'')^{n-1}]$$

$$\qquad = x_n(1 - c) + z_1(1 - c'')^{n-1}\pi(c'' - c) + c\pi.$$

This difference equation has the following solution:⁶

$$x_n = \pi - (\pi - x_1)(1-c)^{n-1} - \pi z_1[(1 - c'')^{n-1} - (1-c)^{n-1}].$$

But $\Pr(A_{1,n}) = x_n + \tfrac{1}{2}z_n$; hence

$$\Pr(A_{1,n}) = \pi - [\pi - \Pr(C_{1,1}) - \pi\Pr(C_{0,1})](1-c)^{n-1} - \Pr(C_{0,1})(\pi - \tfrac{1}{2})(1 - c'')^{n-1}. \tag{20}$$

6 The solution of such a difference equation can readily be obtained.
Consider $x_{n+1} = ax_n + bc^{n-1} + d$ where $a$, $b$, $c$ and $d$ are constants.
Then

(1) $x_2 = ax_1 + b + d$.

Similarly $x_3 = ax_2 + bc + d$ and substituting (1) for $x_2$ we obtain

(2) $x_3 = a^2x_1 + ab + ad + bc + d$.

Similarly $x_4 = ax_3 + bc^2 + d$ and substituting (2) for $x_3$ we obtain

(3) $x_4 = a^3x_1 + a^2b + a^2d + abc + ad + bc^2 + d$.

If we continue in this fashion it will be obvious that for $n \ge 2$

$$x_n = a^{n-1}x_1 + d\sum_{i=0}^{n-2} a^i + b\sum_{i=0}^{n-2} a^i c^{\,n-2-i}.$$

Carrying out the summations yields the desired results. See Jordan (1950,
p. 583-584) for a detailed treatment.

If $\Pr(C_{0,1}) = 0$ then we have a simple exponential learning function starting
at $\Pr(C_{1,1})$ and approaching $\pi$ at a rate determined by $c$. If $\Pr(C_{0,1}) \ne 0$,
then the rate of approach is a function of both $c$ and $c''$.
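
The learning function of Eq. 20 can be compared with a trial-by-trial iteration of the chain in Eq. 19. The sketch below is illustrative only: the parameter values are arbitrary, `c2` stands for $c''$, and `x1` and `z1` stand for the initial probabilities $\Pr(C_{1,1})$ and $\Pr(C_{0,1})$.

```python
# Sketch: iterate Eq. 19 (case c' = 0) and compare with the closed form, Eq. 20.

def iterate_curve(pi, c, c2, x1, z1, n_trials):
    """Return Pr(A1) on trials 1..n_trials by stepping the chain."""
    x, y, z = x1, 1 - x1 - z1, z1
    curve = []
    for _ in range(n_trials):
        curve.append(x + z / 2)
        x, y, z = (x * (1 - c * (1 - pi)) + y * c * pi + z * c2 * pi,
                   x * c * (1 - pi) + y * (1 - c * pi) + z * c2 * (1 - pi),
                   z * (1 - c2))
    return curve

def closed_form(pi, c, c2, x1, z1, n):
    """Eq. 20."""
    return (pi - (pi - x1 - pi * z1) * (1 - c) ** (n - 1)
            - z1 * (pi - 0.5) * (1 - c2) ** (n - 1))

curve = iterate_curve(pi=0.7, c=0.2, c2=0.3, x1=0.2, z1=0.5, n_trials=30)
for n in (1, 2, 5, 30):
    print(n, curve[n - 1], closed_form(0.7, 0.2, 0.3, 0.2, 0.5, n))
```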
We now consider one simple sequential prediction to illustrate another
feature of the one-element model for $c' = 0$. Specifically, consider the
probability of an $A_1$ response on trial $n+1$ given a reinforced $A_1$
response on trial $n$; namely $\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n})$. Note first of all that

$$\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n}) = \frac{\Pr(A_{1,n+1}E_{1,n}A_{1,n})}{\Pr(E_{1,n}A_{1,n})}.$$

Further we may write

$$\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \sum_{i,j} \Pr(A_{1,n+1}C_{i,n+1}E_{1,n}A_{1,n}C_{j,n})$$

$$\qquad = \sum_{i,j} \Pr(A_{1,n+1} \mid C_{i,n+1}E_{1,n}A_{1,n}C_{j,n})\,\Pr(C_{i,n+1} \mid E_{1,n}A_{1,n}C_{j,n})\,\Pr(E_{1,n} \mid A_{1,n}C_{j,n})\,\Pr(A_{1,n} \mid C_{j,n})\,\Pr(C_{j,n}).$$

But by assumption the probability of a response is determined solely by the
conditioning state, and hence
$\Pr(A_{1,n+1} \mid C_{i,n+1}E_{1,n}A_{1,n}C_{j,n}) = \Pr(A_{1,n+1} \mid C_{i,n+1})$.
Further, by assumption the probability of an $E_1$ event is independent of
other events, and hence $\Pr(E_{1,n} \mid A_{1,n}C_{j,n}) = \pi$. Substituting these
results in the above expression we obtain

$$\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\sum_{i,j} \Pr(A_{1,n+1} \mid C_{i,n+1})\,\Pr(C_{i,n+1} \mid E_{1,n}A_{1,n}C_{j,n})\,\Pr(A_{1,n} \mid C_{j,n})\,\Pr(C_{j,n}).$$

Both $i$ and $j$ run over 0, 1 and 2 and therefore there are nine terms
in the sum; but note that when either $i$ or $j$ is 2 the terms
$\Pr(A_{1,n+1} \mid C_{2,n+1})$ and $\Pr(A_{1,n} \mid C_{2,n})$ both equal 0. Consequently, it
suffices to limit $i$ and $j$ to 0 and 1, and we have

$$\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\sum_{i=0}^{1} \Pr(A_{1,n+1} \mid C_{i,n+1})\,\Pr(C_{i,n+1} \mid E_{1,n}A_{1,n}C_{1,n})\,\Pr(A_{1,n} \mid C_{1,n})\,\Pr(C_{1,n})$$

$$\qquad + \pi\sum_{i=0}^{1} \Pr(A_{1,n+1} \mid C_{i,n+1})\,\Pr(C_{i,n+1} \mid E_{1,n}A_{1,n}C_{0,n})\,\Pr(A_{1,n} \mid C_{0,n})\,\Pr(C_{0,n}).$$

Since the subject cannot leave state $C_1$ on a trial when $A_1$ is reinforced,
we know that $\Pr(C_{1,n+1} \mid E_{1,n}A_{1,n}C_{1,n}) = 1$ and
$\Pr(C_{0,n+1} \mid E_{1,n}A_{1,n}C_{1,n}) = 0$; further,
$\Pr(A_{1,n} \mid C_{1,n}) = \Pr(A_{1,n+1} \mid C_{1,n+1}) = 1$. Therefore, the first sum is
simply $\pi\Pr(C_{1,n})$. For the second sum
$\Pr(C_{1,n+1} \mid E_{1,n}A_{1,n}C_{0,n}) = c''$ and
$\Pr(C_{0,n+1} \mid E_{1,n}A_{1,n}C_{0,n}) = 1 - c''$. Further
$\Pr(A_{1,n} \mid C_{0,n}) = \frac{1}{2}$; hence for the second sum we obtain

$$\pi[c'' + (1 - c'')\tfrac{1}{2}]\,\tfrac{1}{2}\Pr(C_{0,n}).$$

Combining these results

$$\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\{\Pr(C_{1,n}) + \tfrac{1}{2}\Pr(C_{0,n})[c'' + (1 - c'')\tfrac{1}{2}]\}.$$

But $\Pr(E_{1,n}A_{1,n}) = \Pr(E_{1,n} \mid A_{1,n})\Pr(A_{1,n}) = \pi\Pr(A_{1,n})$, whence

$$\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n}) = \frac{\Pr(C_{1,n}) + \tfrac{1}{2}\Pr(C_{0,n})[c'' + (1 - c'')\tfrac{1}{2}]}{\Pr(A_{1,n})}.$$

We know that $\Pr(C_{1,n})$ and $\Pr(A_{1,n})$ both approach $\pi$ in the limit
and that $\Pr(C_{0,n})$ approaches 0. Therefore, we predict that

$$\lim_{n \to \infty} \Pr(A_{1,n+1} \mid E_{1,n}A_{1,n}) = 1.$$

This prediction provides a very sharp test for this particular case of
the model and one that is certain to fail in almost any experimental
situation. That is, even after a large number of trials it is hard to
conceive of an experimental procedure such that a response will be repeated
with probability 1 if it occurred and was reinforced on the preceding trial.
Later we shall consider a multi-element model which provides an excellent
description of many sets of data but is based on essentially the same
conditioning rules specified by this case of $c' = 0$. It should be emphasized
that deterministic predictions of the sort given in the equation above are
peculiar to one-element models; for the multi-element case such
difficulties do not arise. This point will be amplified later.
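
The approach of this sequential statistic to 1 can be traced trial by trial. The sketch below, offered only as an illustration with arbitrary parameter values, steps the state probabilities for the case $c' = 0$ and evaluates the ratio derived above on each trial; `c2` stands for $c''$, and `x1` and `z1` for the initial probabilities of $C_1$ and $C_0$.

```python
# Sketch: Pr(A1 on trial n+1 | reinforced A1 on trial n) for the case c' = 0.

def repeat_after_reinforcement(pi, c, c2, x1, z1, n_trials):
    """Return the conditional probability for n = 1..n_trials."""
    x, z = x1, z1
    values = []
    for _ in range(n_trials):
        y = 1 - x - z
        numerator = x + 0.5 * z * (c2 + (1 - c2) * 0.5)
        values.append(numerator / (x + 0.5 * z))      # denominator is Pr(A1 on trial n)
        x, z = x * (1 - c * (1 - pi)) + y * c * pi + z * c2 * pi, z * (1 - c2)
    return values

values = repeat_after_reinforcement(pi=0.7, c=0.2, c2=0.3, x1=0.2, z1=0.5, n_trials=200)
print(values[0], values[9], values[-1])
```

With these inputs the statistic starts well below 1 and converges to 1 as the probability of the unconditioned state dies out.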

Case of c = 0. We now consider the case in which direct
counterconditioning does not occur, i.e., $c = 0$, and thus $\rho = 0$ and
$0 < \epsilon < \infty$. With this restriction the chain is still ergodic since
it is possible to go from every state to every other state, but transitions
between $C_1$ and $C_2$ must go by way of $C_0$. Letting $\rho = 0$ in Eq. 18
we obtain

$$\Pr(A_{1,\infty}) = \frac{\pi^2 + \tfrac{1}{2}\pi(1-\pi)\epsilon}{\pi^2 + \pi(1-\pi)\epsilon + (1-\pi)^2}. \tag{21}$$

From Eq. 21 we can draw some interesting conclusions about the
relationship of the asymptotic response probabilities to the ratio
$\epsilon = c'/c''$. Differentiating with respect to $\epsilon$, we obtain

$$\frac{d}{d\epsilon}\Pr(A_{1,\infty}) = \frac{\pi(1-\pi)(\tfrac{1}{2} - \pi)}{[\pi^2 + (1-\pi)^2 + \pi(1-\pi)\epsilon]^2}.$$

If $\pi(1-\pi)(\tfrac{1}{2} - \pi) \ne 0$ (i.e., if $\pi \ne 0, 1, \tfrac{1}{2}$) then
$\Pr(A_{1,\infty})$ has no maximum for $\epsilon$ in the open interval $(0, \infty)$,
which is the permissible range on $\epsilon$. In fact, since the sign of the
derivative is independent of $\epsilon$ we know that $\Pr(A_{1,\infty})$ is either
monotone increasing or monotone decreasing in $\epsilon$: strictly increasing
if $\pi(1-\pi)(\tfrac{1}{2} - \pi) > 0$ (i.e., $\pi < \tfrac{1}{2}$) and decreasing if
$\pi(1-\pi)(\tfrac{1}{2} - \pi) < 0$ (i.e., $\pi > \tfrac{1}{2}$). Moreover, because of
the monotonicity of $\Pr(A_{1,\infty})$ in $\epsilon$, it is easy to compute bounds
from Eq. 21. Firstly, we see immediately that the lower bound (assuming
$\pi > \tfrac{1}{2}$) is $\lim_{\epsilon \to \infty} \Pr(A_{1,\infty}) = \tfrac{1}{2}$. Secondly, when
$\epsilon$ is very small, $\Pr(A_{1,\infty})$ approaches $\pi^2/[\pi^2 + (1-\pi)^2]$. Note,
however, that Eq. 21 is inapplicable when $\epsilon = 0$; for if both $c = 0$ and
$c' = 0$, the transition matrix (Eq. 14) reduces to

$$P = \begin{bmatrix} 1 & 0 & 0 \\ c''\pi & 1 - c'' & c''(1-\pi) \\ 0 & 0 & 1 \end{bmatrix},$$

and if the process starts in $C_0$, $\Pr(A_{1,\infty}) = \pi$. But for $\epsilon > 0$, if
$\pi > \tfrac{1}{2}$, $\Pr(A_{1,\infty})$ is a decreasing function of $\epsilon$ and its values
lie in the half open interval

$$\tfrac{1}{2} \le \Pr(A_{1,\infty}) < \frac{\pi^2}{\pi^2 + (1-\pi)^2}.$$

It is readily determined that probability matching would not be predicted
in this case. When $\epsilon$ is greater than 2, the predicted value of
$\Pr(A_{1,\infty})$ is less than $\pi$, and when this ratio is less than 2, the
predicted value of $\Pr(A_{1,\infty})$ is greater than $\pi$.
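
The behavior described here is easy to tabulate. The following sketch evaluates Eq. 21 over a range of $\epsilon$ for an illustrative value $\pi = .7$; the function name and the chosen values are assumptions made for the example. Setting Eq. 21 equal to $\pi$ and solving for $\epsilon$ gives $\epsilon = 2$ whenever $\pi \ne 0, 1, \frac{1}{2}$, which is the crossover point quoted in the text.

```python
# Sketch: asymptotic Pr(A1) as a function of eps = c'/c'' when c = 0 (Eq. 21).

def asymptote_c_zero(pi, eps):
    num = pi ** 2 + 0.5 * pi * (1 - pi) * eps
    den = pi ** 2 + pi * (1 - pi) * eps + (1 - pi) ** 2
    return num / den

pi = 0.7
for eps in (0.01, 0.5, 2.0, 5.0, 100.0):
    print(eps, asymptote_c_zero(pi, eps))
# upper limit as eps -> 0: pi^2 / (pi^2 + (1 - pi)^2); lower limit as eps -> infinity: 1/2
```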


Finally we derive $\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n})$ for this case. The derivation
is identical to that given for the case of $c' = 0$. Hence

$$\lim_{n \to \infty} \Pr(A_{1,n+1} \mid E_{1,n}A_{1,n}) = \frac{u_1 + \tfrac{1}{2}u_0[c'' + (1 - c'')\tfrac{1}{2}]}{u_1 + \tfrac{1}{2}u_0}.$$

Note however that for $c = 0$, the quantity $u_0$ is never 0 (except for
$\pi = 0, 1$), and consequently $\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n})$ is always less than 1.
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Contingent reinforcement. As a final example we shall apply the 
one-element model to a situation where the reinforcing event on trial n 
is contingent on the response on that trial. Simple contingent reinforce-~ 


ment is defined by two probabilities a, and %, such that 
P(E, lA. on) Be cris PHB) Ay) = My ° 


We consider the case of the model in which c' =0O and Pr(C, D = 0. 
= 


That. is, the subject is not in state C, on trial 1 and (since c' = 0) 


0 


he can never reach Cy from Cc or C,. Hence, on all trials he is in 
Cy or Cy) and transitions between these states are governed by the 
single parameter c. The trees for the cy and C5 states are given 


in Figure 4, 


Insert Fig. 4 about here 


The transition matrix is 


ee C 
cy 1-(L-x,,)e (1-1, ,)¢ 
P= ; 
cy Choy L-cn) 


and in terms of this matrix we may write 


Pr(Cy a) - Pr(cy ,){1-(1-1,,)e) + Pr(Cy ,)em a7 . 
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Ch neh 


Ce nal 





“1, nt 


onal 


C3 nal 





yntl 


Fig. 4. Branching process for one element model in two-choice, 
contingent case. 
But $\Pr(C_{2,n}) = 1 - \Pr(C_{1,n})$ and $\Pr(C_{1,n}) = \Pr(A_{1,n})$;
hence

$$\Pr(A_{1,n+1}) = \Pr(A_{1,n})\,[1 - c(1-\pi_{11}) - c\pi_{21}] + c\pi_{21}.$$

This difference equation has the solution

$$\Pr(A_{1,n}) = \Pr(A_{1,\infty}) - [\Pr(A_{1,\infty}) - \Pr(A_{1,1})]\,[1 - c(1-\pi_{11}+\pi_{21})]^{n-1},$$

where

$$\Pr(A_{1,\infty}) = \frac{\pi_{21}}{1-\pi_{11}+\pi_{21}}.$$

The asymptote is independent of c, and the rate of approach is
determined by the quantity $c(1-\pi_{11}+\pi_{21})$. It is interesting to
note that the learning function for $\Pr(A_{1,n})$ in this case of the
one-element model is identical to that of the linear model (cf. Estes
and Suppes, 1959a).
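As a numerical check on the contingent case, the difference equation can be iterated and compared with its closed-form solution. The following is only a sketch under stated assumptions: the function names and the parameter values in the usage note are illustrative and do not come from the text.

```python
def contingent_curve(p1, c, pi11, pi21, n_trials):
    """Iterate Pr(A_{1,n+1}) = Pr(A_{1,n})[1 - c(1 - pi11) - c*pi21] + c*pi21.

    p1 is Pr(A_{1,1}); the returned list holds Pr(A_{1,n}) for n = 1..n_trials.
    """
    probs = [p1]
    for _ in range(n_trials - 1):
        p = probs[-1]
        probs.append(p * (1.0 - c * (1.0 - pi11) - c * pi21) + c * pi21)
    return probs


def contingent_closed_form(p1, c, pi11, pi21, n):
    """Closed-form solution for trial n (n = 1, 2, ...).

    Assumes 1 - pi11 + pi21 > 0, so that the asymptote is defined.
    """
    p_inf = pi21 / (1.0 - pi11 + pi21)
    return p_inf - (p_inf - p1) * (1.0 - c * (1.0 - pi11 + pi21)) ** (n - 1)
```

For instance, with the hypothetical values `p1 = 0.5`, `c = 0.2`, `pi11 = 0.7`, `pi21 = 0.4`, the two functions agree trial by trial, and the curve approaches 0.4 / 0.7 whatever value of c is used.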
3. MULTI-ELEMENT PATTERN MODELS
3.1 General Formulation

In the literature of stimulus sampling theory a variety of proposals
have been made for conceptually representing the stimulus situation.
Fundamental to all of these suggestions has been the distinction between
pattern elements and component elements. For the one-element case this
distinction does not play a serious role, but for multi-element
formulations these alternative representations of the stimulus situation
specify different mathematical processes.

In component models, the stimulating situation is represented as a
population of elements which the learner is viewed as sampling from trial
to trial. It is assumed that the conditioning of individual elements to
responses occurs independently as the elements are sampled in conjunction
with reinforcing events, and that the response probability in the
presence of a sample containing a number of elements is determined by an
averaging rule. The principal consideration has been to account for
response variability to an apparently constant stimulus situation by
postulating random fluctuations from trial to trial in the particular
sample of stimulus elements affecting the learner. These component
models have provided a mechanism for effecting a reconciliation between
the picture of gradual change usually exhibited by the learning curve
and the all-or-none law of association.

For many experimental situations a detailed account of the
quantitative properties of learning can be given by component models that
assume discrete associations between responses and the independently
variable elements of a stimulating situation. However, in some cases
predictions from component models fail, and it appears that a simple
account of the learning process requires the assumption that responses
become associated, not with separate components or aspects of a stimulus
situation, but with total patterns of stimulation considered as units.

The model presented in this section is intended to represent such a case.
In it we assume that an experimentally specified stimulating situation
can be conceived as an assemblage of distinct, mutually exclusive patterns
of stimulation, each of which becomes conditioned to responses on an
all-or-none basis. By "mutually exclusive" we mean that exactly one of
the patterns occurs (is sampled by the subject) on each trial. By "distinct"
we mean that no generalization occurs from one pattern to another. Thus
the clearest experimental interpretation would involve a set of patterns
having no common elements (i.e., common properties or components). In
practice the pattern model has also been applied with considerable success
to experiments in which the alternative stimuli have some common elements,
but nevertheless are sufficiently discriminable so that generalization
effects (e.g., "confusion errors") are small and can be neglected without
serious error.

In this presentation we shall limit consideration to cases in which
patterns are sampled randomly with equal likelihood so that, if there
are N patterns, each has probability 1/N of being sampled on a trial.
This sampling assumption represents only one way of formulating the model
and is presented here because it generates a fairly simple mathematical
process and provides a good account of a variety of experimental results.
However, this particular scheme for sampling patterns has restricted
applicability. For example, in certain experiments it can be demonstrated
that the stimulus array to which the subject responds is in large part
determined by events on previous trials; that is, trace stimulation
associated with previous responses and rewards determines the stimulus
pattern to which the subject responds. When this is the case, it is
necessary to postulate a more general rule for sampling patterns than the
random scheme proposed above (e.g., see the discussion of "hypothesis
models" in Suppes and Atkinson, 1960).

Before stating the axioms for the pattern model to be considered in
this section we define the following notions. As before, the behaviors
available to the subject are categorized into mutually exclusive and
exhaustive response classes $(A_1, A_2, \ldots, A_r)$. The possible
experimenter-defined outcomes of a trial (e.g., giving or withholding
reward, unconditioned stimulus, knowledge of results) are classified by
their effect on response probability and are represented by a mutually
exclusive and exhaustive set of reinforcing events
$(E_0, E_1, \ldots, E_r)$. The event $E_i$ $(i \neq 0)$ indicates
that response $A_i$ is reinforced and $E_0$ represents any trial outcome
whose effect is neutral (i.e., reinforces none of the $A_i$'s). The
subject's response and the experimenter-defined outcomes are observable,
but the occurrence of $E_i$ is a purely hypothetical event that represents
the reinforcing effect of the trial outcome. Event $E_i$ is said to have
occurred when the outcome of a trial is such as to increase the probability
of response $A_i$ in the presence of the given stimulus--provided, of course,
that this probability is not already at its maximum value.

We now present the axioms. The first group of axioms deals with the
conditioning of sampled patterns, the second group with the sampling of
patterns, and the third group with responses.
Conditioning Axioms

C1. On every trial each pattern is conditioned to exactly one response.

C2. If a pattern is sampled on a trial, it becomes conditioned with
probability c to the response (if any) that is reinforced on the
trial; if it is already conditioned to that response, it remains so.

C3. If no reinforcement occurs on a trial (i.e., $E_0$ occurs), there
is no change in conditioning on that trial.

C4. Patterns that are not sampled on a trial do not change their
conditioning on that trial.

C5. The probability c that a sampled pattern will be conditioned to
a reinforced response is independent of the trial number and the
preceding events.

Sampling Axioms

S1. Exactly one pattern is sampled on each trial.

S2. Given the set of N patterns available for sampling on a trial, the
probability of sampling a given pattern is 1/N, independently of
the trial number and the preceding events.

Response Axiom

R1. On any trial that response is made to which the sampled pattern is
conditioned.


Later in this section we apply these axioms to a two-choice learning
experiment and to a paired-comparison study. First, however, we shall
prove several general theorems. Before we can begin our analysis it is
necessary to define the notion of a conditioning state. For the axioms
above, all patterns are sampled with equal probability, and it suffices
to let the state of conditioning indicate the number of patterns
conditioned to each response. Hence for r responses the conditioning states
are the ordered r-tuples $\langle k_1, k_2, \ldots, k_r \rangle$ where
$k_i = 0, 1, 2, \ldots, N$ and $k_1 + k_2 + \cdots + k_r = N$; the integer
$k_i$ denotes the number of patterns conditioned to the $A_i$ response.
The number of possible conditioning states is $\binom{N+r-1}{N}$. (In a
generalized model which permitted different patterns to have different
likelihoods of being sampled, it would be necessary to specify not only
the number of patterns conditioned to a response but also the sampling
probabilities associated with the patterns.)
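The count of conditioning states can be checked by brute-force enumeration of the ordered r-tuples. This is an illustrative sketch only; the function name `conditioning_states` is an assumption and does not appear in the source.

```python
from itertools import product
from math import comb


def conditioning_states(N, r):
    """List all ordered r-tuples (k_1, ..., k_r) of non-negative integers
    whose entries sum to N, i.e. the conditioning states for N patterns
    and r response classes."""
    return [ks for ks in product(range(N + 1), repeat=r) if sum(ks) == N]


def number_of_states(N, r):
    """Closed-form count of the same tuples: C(N + r - 1, N)."""
    return comb(N + r - 1, N)
```

With two responses the enumeration yields the N + 1 states used in the remainder of the section; with r = 3 and N = 4, for example, both functions give 15 states.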

For simplicity, in this section we limit consideration to the case
of two alternatives except for one example where r = 3. Given only
two alternatives we denote the conditioning state on trial n of an
experiment as $C_{i,n}$, where $i = 0, 1, 2, \ldots, N$; the subscript i
indicates the number of patterns conditioned to $A_1$ and N-i the number
conditioned to $A_2$.


Transition Probabilities. Only one pattern is sampled per trial;
therefore, the subject can go from state $C_i$ to only one of the three
states $C_{i-1}$, $C_i$, or $C_{i+1}$ on any given trial. The probabilities
of these transitions depend on the value of the conditioning parameter c,
the reinforcement schedule, and the value of i. We now proceed to
compute these probabilities.

If the subject is in state $C_i$ on trial n and an $E_1$ occurs,
then the possible outcomes are indicated by the tree in Figure 5.

Insert Fig. 5 about here
Fig. 5. Branching process for N-element model on a trial when the
subject starts in state $C_i$ and an $E_1$ event occurs.
On the upper main branch, which has probability $\frac{i}{N}$, a pattern
that is conditioned to $A_1$ is sampled and, since an $E_1$ reinforcement
occurs, the pattern remains conditioned to $A_1$. Hence, the conditioning
state on trial n + 1 is the same as on trial n (see Axiom C2). On the
lower main branch, which has probability $\frac{N-i}{N}$, a pattern
conditioned to $A_2$ is sampled; then with probability c the pattern is
conditioned to $A_1$ and the subject moves to conditioning state
$C_{i+1}$, whereas with probability 1-c conditioning is not effective and
the subject remains in state $C_i$. Putting these results together we
obtain

$$\Pr(C_{i+1,n+1} \mid E_{1,n} C_{i,n}) = c\,\frac{N-i}{N}, \qquad \Pr(C_{i,n+1} \mid E_{1,n} C_{i,n}) = 1 - c + c\,\frac{i}{N}. \tag{22a}$$

Similarly, if an $E_2$ occurs on trial n,

$$\Pr(C_{i-1,n+1} \mid E_{2,n} C_{i,n}) = c\,\frac{i}{N}, \qquad \Pr(C_{i,n+1} \mid E_{2,n} C_{i,n}) = 1 - c + c\,\frac{N-i}{N}. \tag{22b}$$

By Axiom C3, if an $E_0$ occurs then

$$\Pr(C_{i,n+1} \mid E_{0,n} C_{i,n}) = 1. \tag{22c}$$





Noting that a transition upward can occur only when a pattern
conditioned to $A_2$ is sampled on an $E_1$ trial, and a transition
downward can occur only when a pattern conditioned to $A_1$ is sampled on
an $E_2$ trial, we can combine the results from Eq. 22a-c to obtain

$$\Pr(C_{i+1,n+1} \mid C_{i,n}) = c\,\frac{N-i}{N}\,\Pr(E_{1,n} \mid A_{2,n} C_{i,n}) \tag{23a}$$

$$\Pr(C_{i-1,n+1} \mid C_{i,n}) = c\,\frac{i}{N}\,\Pr(E_{2,n} \mid A_{1,n} C_{i,n}) \tag{23b}$$

$$\Pr(C_{i,n+1} \mid C_{i,n}) = 1 - c + c\left[\frac{i}{N}\,\Pr(E_{1,n} \mid A_{1,n} C_{i,n}) + \frac{N-i}{N}\,\Pr(E_{2,n} \mid A_{2,n} C_{i,n}) + \Pr(E_{0,n} \mid C_{i,n})\right] \tag{23c}$$

for the probabilities of one-step transitions between states. Equation 23a,
for example, states that the probability of moving from the state with i
elements conditioned to $A_1$ to the state with i + 1 elements conditioned
to $A_1$ is the product of the probability $\frac{N-i}{N}$ that an element
not already conditioned to $A_1$ is sampled and the probability
$c\,\Pr(E_{1,n} \mid A_{2,n} C_{i,n})$ that, under the given circumstances,
conditioning occurs.

As defined earlier, we have a Markov process in the conditioning states
if the probability of a transition from any state to any other state
depends at most on the state existing on the trial preceding the
transition. By inspection of Eq. 23 we see that the Markov condition may be
satisfied by limiting ourselves to reinforcement schedules in which the
probability of a reinforcing event $E_i$ depends at most on the response
of the given trial; that is, in learning-theory terminology, to
noncontingent and simple contingent schedules. This restriction will be
assumed throughout the present section except for a few remarks in which
we explicitly consider various lines of generalization.


With these restrictions in mind, we define

$$\pi_{ij} = \Pr(E_{j,n} \mid A_{i,n}),$$

where j = 0 to r, i = 1 to r, and $\sum_j \pi_{ij} = 1$. That is, the
reinforcement on a trial depends at most on the response of the given
trial; further, the reinforcement probabilities do not depend on the trial
number. We may then rewrite Eq. 23 as follows:

$$q_{i,i+1} = c\,\frac{N-i}{N}\,\pi_{21} \tag{24a}$$

$$q_{i,i-1} = c\,\frac{i}{N}\,\pi_{12} \tag{24b}$$

$$q_{i,i} = 1 - c\,\frac{N-i}{N}\,\pi_{21} - c\,\frac{i}{N}\,\pi_{12}. \tag{24c}$$

Note that we use the notation $q_{ij}$ in place of
$\Pr(C_{j,n+1} \mid C_{i,n})$. The reason is that the transition
probabilities do not depend on n, given the restrictions on the
reinforcement schedule stated above, and the simpler notation expresses
this fact.
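To make the structure of the chain concrete, the one-step transition probabilities for the two-response case, q(i, i+1) = c((N-i)/N)pi21, q(i, i-1) = c(i/N)pi12, and q(i, i) as the remainder, can be assembled into a full (N+1) by (N+1) matrix. The sketch below assumes that no neutral event E0 occurs; the function name `transition_matrix` is hypothetical.

```python
def transition_matrix(N, c, pi12, pi21):
    """Return the (N+1) x (N+1) matrix q[i][j] for the pattern model with
    two responses, where state i means i patterns are conditioned to A1.

    pi12 = Pr(E2 | A1) and pi21 = Pr(E1 | A2); E0 is assumed never to occur.
    """
    q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        up = c * (N - i) / N * pi21    # an A2 pattern is sampled, E1 occurs, conditioning takes
        down = c * i / N * pi12        # an A1 pattern is sampled, E2 occurs, conditioning takes
        if i < N:
            q[i][i + 1] = up
        if i > 0:
            q[i][i - 1] = down
        q[i][i] = 1.0 - up - down      # no change of state
    return q
```

Each row sums to one, and only the diagonal and its two neighbours are nonzero, which is what makes the asymptotic distribution of states easy to obtain.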

Response Probabilities and Moments. By Axioms S1, S2, and R1 we
know that the relation between response probability and the conditioning
state is simply

$$\Pr(A_{1,n} \mid C_{i,n}) = \frac{i}{N}.$$

Hence

$$\Pr(A_{1,n}) = \sum_{i=0}^{N} \Pr(A_{1,n} \mid C_{i,n})\,\Pr(C_{i,n}) = \sum_{i=0}^{N} \frac{i}{N}\,\Pr(C_{i,n}). \tag{25}$$

But note that by definition of the transition probabilities $q_{ij}$,

$$\Pr(C_{i,n}) = \Pr(C_{0,n-1})\,q_{0i} + \Pr(C_{1,n-1})\,q_{1i} + \cdots + \Pr(C_{N,n-1})\,q_{Ni} = \sum_{j=0}^{N} \Pr(C_{j,n-1})\,q_{ji}. \tag{26}$$


The latter expression, together with Eq. 25, serves as the basis for a
general recursion in $\Pr(A_{1,n})$:

$$\Pr(A_{1,n}) = \sum_{i=0}^{N} \frac{i}{N} \sum_{j=0}^{N} \Pr(C_{j,n-1})\,q_{ji}.$$

Now substituting for $q_{ji}$ in terms of Eq. 24 and rearranging the sum we
have

$$\Pr(A_{1,n}) = \sum_{i=0}^{N} \frac{i}{N}\,\Pr(C_{i,n-1}) - c\pi_{12} \sum_{i=1}^{N} \frac{i^2}{N^2}\,\Pr(C_{i,n-1}) - c\pi_{21} \sum_{i=0}^{N-1} \frac{i(N-i)}{N^2}\,\Pr(C_{i,n-1})$$
$$\qquad + c\pi_{21} \sum_{i=0}^{N-1} \frac{(i+1)(N-i)}{N^2}\,\Pr(C_{i,n-1}) + c\pi_{12} \sum_{i=1}^{N} \frac{i(i-1)}{N^2}\,\Pr(C_{i,n-1}).$$

The first sum is, by Eq. 25, $\Pr(A_{1,n-1})$. Let us define
$\alpha_{2,n} = \sum_{i=0}^{N} \frac{i^2}{N^2}\,\Pr(C_{i,n})$; then the
second sum is simply $-c\pi_{12}\,\alpha_{2,n-1}$. Similarly the third sum is

$$-c\pi_{21}\,[\Pr(A_{1,n-1}) - \Pr(C_{N,n-1}) - \alpha_{2,n-1} + \Pr(C_{N,n-1})] = -c\pi_{21}\,[\Pr(A_{1,n-1}) - \alpha_{2,n-1}],$$

and so forth. Carrying out the summation and simplifying we obtain the
following recursion in $\Pr(A_{1,n})$:

$$\Pr(A_{1,n}) = \left[1 - \frac{c}{N}(\pi_{12} + \pi_{21})\right] \Pr(A_{1,n-1}) + \frac{c}{N}\,\pi_{21}. \tag{27}$$

This difference equation has the well-known solution (cf. Bush and
Mosteller, 1955; Estes, 1959b; Estes and Suppes, 1959)

$$\Pr(A_{1,n}) = \Pr(A_{1,\infty}) - [\Pr(A_{1,\infty}) - \Pr(A_{1,1})] \left[1 - \frac{c}{N}(\pi_{12} + \pi_{21})\right]^{n-1}, \tag{28}$$

where

$$\Pr(A_{1,\infty}) = \frac{\pi_{21}}{\pi_{21} + \pi_{12}}.$$
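The mean learning curve can be checked numerically by propagating the full distribution over conditioning states one trial at a time and comparing the resulting response probability with the closed-form solution of the difference equation. This is a sketch under the assumption of two responses and no E0 events; the function names are illustrative only.

```python
def step(dist, N, c, pi12, pi21):
    """Advance the distribution over states C_0..C_N by one trial."""
    new = [0.0] * (N + 1)
    for i, p in enumerate(dist):
        up = c * (N - i) / N * pi21
        down = c * i / N * pi12
        new[i] += p * (1.0 - up - down)
        if i < N:
            new[i + 1] += p * up
        if i > 0:
            new[i - 1] += p * down
    return new


def mean_response(dist, N):
    """Pr(A_1) = sum_i (i/N) Pr(C_i)."""
    return sum(i / N * p for i, p in enumerate(dist))


def closed_form_mean(p1, n, N, c, pi12, pi21):
    """Solution of the linear difference equation, with p1 = Pr(A_{1,1})."""
    p_inf = pi21 / (pi12 + pi21)
    return p_inf - (p_inf - p1) * (1.0 - c / N * (pi12 + pi21)) ** (n - 1)
```

Starting, say, from the hypothetical initial distribution in which exactly one of N = 4 patterns is conditioned to A1, `mean_response` after each call to `step` reproduces `closed_form_mean` on the corresponding trial.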


At this point it will also be instructive to calculate the variance
of the distribution of response probabilities $\Pr(A_{1,n} \mid C_{i,n})$.
The second raw moment as defined above is

$$\alpha_{2,n} = \sum_{i=0}^{N} \frac{i^2}{N^2}\,\Pr(C_{i,n}) = \sum_{i=0}^{N} \frac{i^2}{N^2} \sum_{j=0}^{N} \Pr(C_{j,n-1})\,q_{ji}. \tag{29}$$

Carrying out the summation as was done in the case of Eq. 27 we obtain

$$\alpha_{2,n} = \alpha_{2,n-1}\left[1 - \frac{2c}{N}(\pi_{12} + \pi_{21})\right] + \Pr(A_{1,n-1})\left[\frac{c}{N^2}(\pi_{12} - \pi_{21}) + \frac{2c\,\pi_{21}}{N}\right] + \frac{c\,\pi_{21}}{N^2}.$$

Subtracting the square of $\Pr(A_{1,n})$ as given in Eq. 28 from
$\alpha_{2,n}$ yields the variance of the response probabilities. The
second and higher moments of the response probabilities are of
experimental interest primarily because they enter into predictions
concerning various sequential statistics. We return to this point later.
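The recursion for the second raw moment can be verified in the same manner as the recursion for the mean. The sketch below propagates the state distribution directly and compares the resulting second moment with the one-step formula alpha2' = alpha2[1 - (2c/N)(pi12 + pi21)] + Pr(A1)[(c/N^2)(pi12 - pi21) + 2c pi21/N] + c pi21/N^2, which follows from the transition probabilities of the two-response case. All function names and parameter values are assumptions made for illustration.

```python
def step(dist, N, c, pi12, pi21):
    """Advance the distribution over states C_0..C_N by one trial."""
    new = [0.0] * (N + 1)
    for i, p in enumerate(dist):
        up = c * (N - i) / N * pi21
        down = c * i / N * pi12
        new[i] += p * (1.0 - up - down)
        if i < N:
            new[i + 1] += p * up
        if i > 0:
            new[i - 1] += p * down
    return new


def moments(dist, N):
    """Return the first and second raw moments of i/N under dist."""
    a1 = sum(i / N * p for i, p in enumerate(dist))
    a2 = sum((i / N) ** 2 * p for i, p in enumerate(dist))
    return a1, a2


def next_second_moment(a1, a2, N, c, pi12, pi21):
    """One-step recursion for the second raw moment."""
    return (a2 * (1.0 - 2.0 * c / N * (pi12 + pi21))
            + a1 * (c / N ** 2 * (pi12 - pi21) + 2.0 * c * pi21 / N)
            + c * pi21 / N ** 2)
```

The variance of the response probabilities on any trial is then the second moment minus the square of the first.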

Asymptotic Distributions. The pattern model has one particularly
advantageous feature not shared by many other learning models that
have appeared in the literature. This feature is a simple calculational
procedure for generating the complete asymptotic distribution
of conditioning states and therefore the asymptotic distribution of
responses. The derivation to be given assumes that all elements
$q_{i,i-1}$, $q_{i,i}$, $q_{i,i+1}$ of the transition matrix are nonzero;
the same technique can be applied if there are zero entries, except, of
course, that in forming ratios one must keep the zeros out of the
denominators.

As in Sec. 2.3, we let $\lim_{n \to \infty} \Pr(C_{i,n}) = u_i$. The
theorem to be proved is that all of the asymptotic conditioning state
probabilities $u_i$ can be expressed recursively in terms of $u_0$; since
the $u_i$'s must sum to unity, this recursion suffices to determine the
entire distribution.


By Eq. 26 we note that

$$u_0 = u_0\,q_{00} + u_1\,q_{10}$$

and hence

$$\frac{u_1}{u_0} = \frac{1 - q_{00}}{q_{10}} = \frac{q_{01}}{q_{10}}.$$

We now prove by induction that a similar relation holds for any adjacent
pair of states; that is

$$\frac{u_{i+1}}{u_i} = \frac{q_{i,i+1}}{q_{i+1,i}}.$$

For any state i, we have by Eq. 26,

$$u_i = u_{i-1}\,q_{i-1,i} + u_i\,q_{i,i} + u_{i+1}\,q_{i+1,i}.$$

Rearranging,

$$u_i\,(1 - q_{i,i}) = u_{i-1}\,q_{i-1,i} + u_{i+1}\,q_{i+1,i}.$$

However, under the inductive hypothesis we may replace $u_{i-1}$ by
its equivalent $u_i\,q_{i,i-1} / q_{i-1,i}$. Hence

$$u_i\,(1 - q_{i,i}) = u_i\,q_{i,i-1} + u_{i+1}\,q_{i+1,i}$$

or

$$u_i\,(1 - q_{i,i} - q_{i,i-1}) = u_{i+1}\,q_{i+1,i}.$$

However, $1 - q_{i,i} - q_{i,i-1} = q_{i,i+1}$, since
$q_{i,i-1} + q_{i,i} + q_{i,i+1} = 1$, and therefore

$$\frac{u_{i+1}}{u_i} = \frac{q_{i,i+1}}{q_{i+1,i}},$$

which concludes the proof.
Thus, we may write

$$u_1 = \frac{q_{01}}{q_{10}}\,u_0, \qquad u_2 = \frac{q_{12}}{q_{21}}\,u_1 = \frac{q_{01}\,q_{12}}{q_{10}\,q_{21}}\,u_0,$$

and so forth. Since the $u_i$'s must sum to unity, $u_0$ also is
determined. To illustrate the application of this technique we consider
some simple cases. For the noncontingent case discussed in Sec. 2.3,
$\pi_{11} = \pi_{21} = \pi$ and $\pi_{12} = \pi_{22} = 1 - \pi$.
By Eq. 24 we have

$$q_{i,i+1} = c\,\frac{N-i}{N}\,\pi, \qquad q_{i,i-1} = c\,\frac{i}{N}\,(1-\pi).$$

Applying the technique of the previous paragraph,

$$\frac{u_1}{u_0} = \frac{N\pi}{1-\pi}, \qquad \frac{u_2}{u_1} = \frac{(N-1)\pi}{2(1-\pi)},$$

and in general

$$\frac{u_k}{u_{k-1}} = \frac{(N-k+1)\,\pi}{k\,(1-\pi)}.$$

This result has two interesting features. First, we note that the
asymptotic probabilities are independent of the conditioning parameter c.
Second, the ratio of $u_k$ to $u_{k-1}$ is the same as that of neighboring
terms

$$\binom{N}{k}\pi^{k}(1-\pi)^{N-k} \quad\text{and}\quad \binom{N}{k-1}\pi^{k-1}(1-\pi)^{N-k+1}$$

in the expansion of $[\pi + (1-\pi)]^N$. Therefore, the asymptotic
probabilities in this case are binomially distributed. For a
population of subjects whose learning is described by the model,
the limiting proportion of subjects having all N patterns conditioned
to $A_1$ is $\pi^N$; the proportion having all but one of the
N patterns conditioned to $A_1$ is $N\pi^{N-1}(1-\pi)$; and so on.


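For the case of simple contingent reinforcement,

$$\frac{u_k}{u_{k-1}} = \frac{\dfrac{(N-k+1)\,\pi_{21}\,c}{N}}{\dfrac{k\,\pi_{12}\,c}{N}} = \frac{(N-k+1)\,\pi_{21}}{k\,\pi_{12}}.$$

Again we note that the $u_i$ are independent of c. Further the ratio
of $u_k$ to $u_{k-1}$ is the same as that of

$$\binom{N}{k}\pi_{21}^{k}\,\pi_{12}^{N-k} \quad\text{to}\quad \binom{N}{k-1}\pi_{21}^{k-1}\,\pi_{12}^{N-k+1}.$$

Therefore the asymptotic state probabilities are the terms in the
expansion of

$$\left[\frac{\pi_{21}}{\pi_{21}+\pi_{12}} + \frac{\pi_{12}}{\pi_{21}+\pi_{12}}\right]^{N}.$$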
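The ratio technique translates directly into a short computation: start from an unnormalized weight for state 0, multiply successively by (N-k+1)pi21 / (k pi12), and normalize. The sketch below (function names are assumptions, and pi12 must be nonzero) can be compared with the binomial terms whose success probability is pi21/(pi21 + pi12).

```python
from math import comb


def asymptotic_states(N, pi12, pi21):
    """Asymptotic probabilities u_0..u_N obtained from the adjacent-state
    ratios u_k / u_{k-1} = (N - k + 1) * pi21 / (k * pi12)."""
    weights = [1.0]                      # unnormalized u_0
    for k in range(1, N + 1):
        weights.append(weights[-1] * (N - k + 1) * pi21 / (k * pi12))
    total = sum(weights)                 # normalization fixes u_0
    return [w / total for w in weights]


def binomial_terms(N, p):
    """Terms of the expansion of [p + (1 - p)]^N, for comparison."""
    return [comb(N, k) * p ** k * (1.0 - p) ** (N - k) for k in range(N + 1)]
```

Note that the conditioning parameter c does not appear as an argument at all, in line with the observation that the asymptotic distribution is independent of c. The noncontingent case corresponds to calling `asymptotic_states(N, 1 - pi, pi)`.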


Explicit formulas for state probabilities are useful primarily as
intermediary expressions in the derivation of other quantities, as will
be seen below. In the special case of the pattern model (unlike other
types of stimulus sampling models) the strict determination of the
response on any trial by the conditioning state of the trial sample
permits a relatively direct empirical interpretation, for the moments
of the distribution of state probabilities are identical with the moments
of the response random variable. Thus, in the simple contingent case
we have immediately for the mean and variance of the response random
variable $A_n$

$$E(A_n) = \sum_{k=0}^{N} \frac{k}{N} \binom{N}{k} \left[\frac{\pi_{21}}{\pi_{21}+\pi_{12}}\right]^{k} \left[\frac{\pi_{12}}{\pi_{21}+\pi_{12}}\right]^{N-k} = \frac{\pi_{21}}{\pi_{21}+\pi_{12}}$$

and

$$\mathrm{Var}(A_n) = \sum_{k=0}^{N} \frac{k}{N} \binom{N}{k} \left[\frac{\pi_{21}}{\pi_{21}+\pi_{12}}\right]^{k} \left[\frac{\pi_{12}}{\pi_{21}+\pi_{12}}\right]^{N-k} - [E(A_n)]^2 = \frac{\pi_{21}\,\pi_{12}}{(\pi_{21}+\pi_{12})^2}.$$


A bit of caution is needed in applying this last expression to data.
If we select some fixed trial n (sufficiently large so that the
learning process may be assumed asymptotic), then the theoretical
variance for the $A_1$ response totals of a number of independent samples
of K subjects on trial n is simply

$$K\,\frac{\pi_{21}\,\pi_{12}}{(\pi_{21}+\pi_{12})^2}$$

by the familiar theorem for the variance of a sum of independent random
variables. However, this expression does not hold for the variance of
$A_1$ response totals over a block of K successive trials. The additional
considerations involved in the latter case will be discussed in the next
section.
3.2 Treatment of the Simple Noncontingent Case

In this section we shall consider various predictions that may be
derived from the pattern model for simple predictive behavior in a
two-choice situation with noncontingent reinforcement. Each trial in the
reference experiment begins with presentation of a ready signal; the
subject's task is to respond to the signal by operating one of a pair
of response keys, $A_1$ or $A_2$, indicating his prediction as to which of
two reinforcing lights will appear. The reinforcing lights are
programmed by the experimenter to occur in random sequence, exactly one on
each trial, with probabilities which are constant throughout the series
and independent of the subject's behavior.

For illustrative purposes, we shall use data from two experiments
of this sort. In one of these, henceforth designated the .6 series,
thirty subjects were run, each for a series of 240 trials, with
probabilities of .6 and .4 for the two reinforcing lights. Details of the
experimental procedure, and a more complete analysis of the data than
we shall undertake here, are given by Suppes and Atkinson (1960, Ch. 10).
In the other experiment, henceforth designated the .8 series, eighty
subjects were run, each for a series of 288 trials, with probabilities
of .8 and .2 for the two reinforcing lights. Details of the procedure
and results have been reported by Friedman et al. (1960). A possibly
important difference between the conditions of the two experiments is
that in the .6 series the subjects were new to this type of experiment
whereas in the .8 series the subjects were highly practiced, having had
experience with a variety of noncontingent schedules in two previous
experimental sessions.
For our present purposes it will suffice to consider only the
simplest possible interpretation of the experimental situation in
terms of the pattern model. Let $O_1$ denote the more frequently
occurring reinforcing light and $O_2$ the less frequent light. We then
postulate a one-to-one correspondence between the appearance of light
$O_i$ and the reinforcing event $E_i$ which is associated with $A_i$ (the
response of predicting $O_i$). Also we assume that the experimental
conditions determine a set of N distinct stimulus patterns, exactly
one of which is present at the onset of any given trial. Since, in
experiments of the sort under consideration, the experimenter usually
presents the same ready signal at the beginning of every trial, one
might assume that N would necessarily equal unity. However, we shall
not impose this restriction on the model. Rather, we shall let N
appear as a free parameter in theoretical expressions; then we shall
seek to determine from the data what value of N is required to
minimize the disparities between theoretical and observed values.

If the data of a particular experiment yield an estimate of N
greater than unity, and if with this estimate the model provides a
satisfactory account of the empirical relationships in question, we
shall conclude that the learning process proceeds as described by the
model but that, regardless of the experimenter's intention, the subjects
are sampling a population of stimulus patterns. The pattern effective
at the onset of a given trial might comprise the experimenter's ready
signal together with stimulus traces (perhaps verbally mediated) of the
reinforcing events and responses of one or more preceding trials.
It will be apparent that the pattern model could scarcely be
expected to provide a completely adequate account of the data of
two-choice experiments run under the conditions sketched above. Firstly,
if the stimulus pattern to which the subject responds includes cues
from preceding events, then it is extremely unlikely that all of the
available patterns would have equal sampling probabilities as assumed
in the model. Secondly, the different patterns must have component
cues in common and these would be expected to yield transfer effects
(at least on early trials) so that the response to a pattern first
sampled on trial n would be influenced by conditioning that occurred
when components of that pattern were present on earlier trials.
However, the pattern model assumes that all of the patterns available for
sampling are distinct in the sense that reinforcement of a response to
one pattern has no effect on response probabilities associated with
other patterns.

Despite these complications, many investigators (e.g., Suppes and
Atkinson, 1960; Estes, 19610; Suppes and Ginsberg, 1962b; Bower, 1961) 
have found it a useful strategy to apply the pattern model in the simple 
form presented in the preceding section. The goal in these applications 
is not the perhaps impossible one of accounting for every detail of the 
experimental results, but rather the more modest, yet realizable, one
of obtaining valuable information about various theoretical assumptions 
by comparing manageably simple models that embody different combinations 
of assumptions. This procedure will be illustrated in the remainder of 


the section. 
Sequential Predictions. We begin our application of the pattern
model with a discussion of sequential statistics. It should be
emphasized that one of the major contributions of mathematical learning
theory has been to provide a framework within which the sequential
aspects of learning can be scrutinized. Prior to the development of
mathematical models, relatively little attention was paid to trial by
trial phenomena; at the present time, for many experimental problems,
such phenomena are viewed as the most interesting aspect of the data.

Although we consider only the noncontingent case, the same methods
may be used to obtain results for more general reinforcement schedules.
We shall develop the proofs in terms of two responses but the results
hold for any number of alternatives. If there are r responses in a
given experimental application, any one response can be denoted $A_1$
and the rest regarded as members of a single class, $A_2$.


We consider first the probability of an $A_1$ response given that
it occurred and was reinforced on the preceding trial; i.e.,
$\Pr(A_{1,n+1} \mid E_{1,n} A_{1,n})$. It is convenient to deal first with
the joint probability $\Pr(A_{1,n+1} E_{1,n} A_{1,n})$, then to
conditionalize later. First we note that

$$\Pr(A_{1,n+1} E_{1,n} A_{1,n}) = \sum_{i,j} \Pr(A_{1,n+1} C_{j,n+1} E_{1,n} A_{1,n} C_{i,n}) \tag{30}$$

and that $\Pr(A_{1,n+1} C_{j,n+1} E_{1,n} A_{1,n} C_{i,n})$ may be expressed
in terms of conditional probabilities as

$$\Pr(A_{1,n+1} \mid C_{j,n+1} E_{1,n} A_{1,n} C_{i,n})\,\Pr(C_{j,n+1} \mid E_{1,n} A_{1,n} C_{i,n})\,\Pr(E_{1,n} \mid A_{1,n} C_{i,n})\,\Pr(A_{1,n} \mid C_{i,n})\,\Pr(C_{i,n}).$$


But from the sampling and response axioms the probability of a response
on trial n is determined solely by the conditioning state on trial
n; i.e., the first factor in the expansion can be rewritten simply as
$\Pr(A_{1,n+1} \mid C_{j,n+1})$. Further, by Axiom R1, we have
$\Pr(A_{1,n+1} \mid C_{j,n+1}) = \frac{j}{N}$.
For the noncontingent case the probability of an $E_1$ on any trial is
independent of previous events and consequently we may write

$$\Pr(E_{1,n} \mid A_{1,n} C_{i,n}) = \pi.$$

Next, we note that

$$\Pr(C_{j,n+1} \mid E_{1,n} A_{1,n} C_{i,n}) = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j. \end{cases}$$

That is, an element conditioned to $A_1$ is sampled on trial n (since
an $A_1$ response occurs on n) and thus by Axiom C2 no change in the
conditioning state can occur.

Putting these results together and substituting in Eq. 30 we
obtain

$$\Pr(A_{1,n+1} E_{1,n} A_{1,n}) = \pi \sum_{i} \sum_{j} \frac{j}{N}\,\Pr(C_{j,n+1} \mid E_{1,n} A_{1,n} C_{i,n})\,\frac{i}{N}\,\Pr(C_{i,n}) = \pi \sum_{i} \frac{i^2}{N^2}\,\Pr(C_{i,n}) = \pi\,\alpha_{2,n} \tag{31a}$$

and

$$\Pr(A_{1,n+1} \mid E_{1,n} A_{1,n}) = \frac{\Pr(A_{1,n+1} E_{1,n} A_{1,n})}{\Pr(E_{1,n} A_{1,n})} = \frac{\alpha_{2,n}}{\Pr(A_{1,n})}. \tag{31b}$$

In order to express this conditional probability in terms of the
parameters $\pi$, c, N, and $\Pr(A_{1,1})$, we simply substitute into
Eq. 31b the expression given for $\Pr(A_{1,n})$ in Eq. 28 and the
corresponding expression for $\alpha_{2,n}$ that would be given by the
solution of the difference equation, Eq. 29. Unfortunately, the expression
so obtained is extremely cumbersome to work with. Consequently it is
usually preferable in working with data to proceed in a different way.

Suppose the data to be treated consist of proportions of occurrences
of the various trigrams $A_{k,n+1} E_{j,n} A_{i,n}$ over blocks of M trials.
If, for example, M = 5, then in the protocol


“ Trial 1 2 3 4 5 


Event ca A,B, AjE, A,E, A)E5 
there are four opportunities for such trigrams. The combination

AL nt® 1 aAt yn occurs on two of these, 4g | nt ft on one, 

and Ay nt ®t, 82, n on one3: hence the proportions of occurrence of 
these trigrams are .5, .25, and .25, respectively. To deal
theoretically with quantities such as these, we need only average both
sides of Eq. 31a (and the corresponding expressions for other trigrams)
over the appropriate block of trials, obtaining, e.g., for the block
running from trial n through trial n+M-1


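$$p_{111} = \frac{1}{M} \sum_{n'=n}^{n+M-1} \Pr(A_{1,n'+1} E_{1,n'} A_{1,n'}) = \frac{\pi}{M} \sum_{n'=n}^{n+M-1} \alpha_{2,n'} = \pi\,\bar{\alpha}_2(n,M), \tag{32a}$$

where $\bar{\alpha}_2(n,M)$ is the average value of the second moment of
the response probabilities over the given trial block. By strictly
analogous methods, we can derive theoretical expressions for other trigram
proportions, e.g.,

$$p_{112} = \frac{1}{M} \sum_{n'=n}^{n+M-1} \Pr(A_{1,n'+1} E_{1,n'} A_{2,n'}) = \pi \left[\left(1 - \frac{c}{N}\right) \bar{\alpha}_1(n,M) + \frac{c}{N} - \bar{\alpha}_2(n,M)\right], \tag{32b}$$

$$p_{121} = \frac{1}{M} \sum_{n'=n}^{n+M-1} \Pr(A_{1,n'+1} E_{2,n'} A_{1,n'}) = (1-\pi) \left[\bar{\alpha}_2(n,M) - \frac{c}{N}\,\bar{\alpha}_1(n,M)\right], \tag{32c}$$

$$p_{122} = \frac{1}{M} \sum_{n'=n}^{n+M-1} \Pr(A_{1,n'+1} E_{2,n'} A_{2,n'}) = (1-\pi) \left[\bar{\alpha}_1(n,M) - \bar{\alpha}_2(n,M)\right], \tag{32d}$$

and so on; the quantity $\bar{\alpha}_1(n,M)$ denoting the average $A_1$
probability (or, equivalently, the proportion of $A_1$ responses) over the
given trial block.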
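On the data side, the observed trigram proportions over a block are obtained by simple counting. The sketch below tallies the triples formed by the response on trial n+1, the reinforcing event on trial n, and the response on trial n. The five-trial protocol in the usage note is a hypothetical example constructed for illustration, not the protocol printed in the text, and the function name is likewise an assumption.

```python
from collections import Counter


def trigram_proportions(protocol):
    """protocol is a list of (response, event) pairs, one per trial, with
    responses and events coded 1 or 2.

    Returns a dict mapping (k, j, i), meaning A_k on trial n+1, E_j on
    trial n, A_i on trial n, to its proportion among the len(protocol) - 1
    opportunities in the block.
    """
    counts = Counter()
    for (resp, event), (next_resp, _next_event) in zip(protocol, protocol[1:]):
        counts[(next_resp, event, resp)] += 1
    total = len(protocol) - 1
    return {key: n / total for key, n in counts.items()}
```

For the hypothetical protocol `[(1, 1), (1, 1), (1, 1), (2, 2), (1, 1)]` there are four opportunities: the trigram (1, 1, 1) occurs twice, and (2, 1, 1) and (1, 2, 2) once each, giving proportions .5, .25, and .25.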
Now the average moments $\bar{\alpha}_i$ can be treated as parameters to
be estimated from the data in order to mediate theoretical predictions. To
illustrate, let us consider a sample of data from the .8 series. Over
the first 12 trials of the $\pi = .8$ series, the observed proportion of
$A_1$ responses for the group of 80 subjects was .63 and the observed
values for the trigrams of Eq. 32a-d were $p_{111} = .379$,
$p_{112} = .168$, $p_{121} = .061$, and $p_{122} = .035$. Using $p_{111}$
to estimate $\bar{\alpha}_2(1,12)$, we have from Eq. 32a

$$.379 = .8\,\bar{\alpha}_2(1,12),$$

which yields as our estimate

$$\hat{\bar{\alpha}}_2(1,12) = .47.$$

Now we are in a position to predict the value of $p_{122}$. Substituting
the appropriate parameter values into Eq. 32d, we have

$$p_{122} = .2\,(.63 - .47) = .032,$$

which is not far from the observed value of .035. Proceeding similarly,
we can use Eq. 32b to estimate $\frac{c}{N}$, viz.

$$p_{112} = .168 = .8\left[\left(1 - \frac{c}{N}\right)(.63) + \frac{c}{N} - .47\right],$$

from which

$$\frac{c}{N} = .135.$$

With this estimate in hand, together with those already obtained for
the first and second moments, we can substitute into Eq. 32c and predict
the value of $p_{121}$:

$$p_{121} = .2\,[.47 - .135(.63)] = .077,$$

which is somewhat high in relation to the observed value of .061.
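The arithmetic of this illustrative estimation can be reproduced in a few lines. The sketch below solves the expression for p111 for the average second moment, solves the expression for p112 for c/N, and then evaluates the predictions for p122 and p121; the function names are assumptions introduced for this example.

```python
def estimate_alpha2(pi, p111):
    """Solve p111 = pi * alpha2 for the average second moment."""
    return p111 / pi


def estimate_c_over_n(pi, alpha1, alpha2, p112):
    """Solve p112 = pi * [(1 - c/N) * alpha1 + c/N - alpha2] for c/N."""
    return (p112 / pi + alpha2 - alpha1) / (1.0 - alpha1)


def predict_p122(pi, alpha1, alpha2):
    """p122 = (1 - pi) * (alpha1 - alpha2)."""
    return (1.0 - pi) * (alpha1 - alpha2)


def predict_p121(pi, alpha1, alpha2, c_over_n):
    """p121 = (1 - pi) * (alpha2 - (c/N) * alpha1)."""
    return (1.0 - pi) * (alpha2 - c_over_n * alpha1)
```

With pi = .8, an observed A1 proportion of .63, and the second-moment estimate rounded to .47 as in the text, `estimate_c_over_n` gives approximately .135, `predict_p122` gives .032, and `predict_p121` gives approximately .077.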

It should be mentioned that the simple estimation method used above
for illustrative purposes would be replaced, in a serious application
of the model, by a more systematic procedure. For example, one might
simultaneously estimate $\bar{\alpha}_2$ and $\frac{c}{N}$ by least
squares, employing all eight of the $p_{ijk}$; this procedure would yield
a better overall fit of the theoretical and observed values.

A limitation of the method just described is that it permits
estimation of the ratio $\frac{c}{N}$, but not estimation of c and N
separately. Fortunately, in the asymptotic case, the expressions for the
moments $\alpha_i$ are simple enough so that expressions for the trigrams
in terms of the parameters are manageable; and it turns out to be easy to
evaluate the conditioning parameter and the number of elements from these
expressions. The limit of $\alpha_{1,n}$ for large n is, of course, $\pi$
in the simple noncontingent case. The limit, $\alpha_2$, of $\alpha_{2,n}$
may be obtained from the solution of Eq. 29; however, a simpler method of
obtaining the same result is to note that, by definition,

$$\alpha_2 = \sum_{i} \frac{i^2}{N^2}\,u_i,$$

where $u_i$ again represents the asymptotic probability of the state in
which i elements are conditioned to $A_1$. Recalling that the $u_i$
are terms of the binomial distribution, we may then write

$$\alpha_2 = \sum_{i} \frac{i^2}{N^2} \binom{N}{i} \pi^{i} (1-\pi)^{N-i} = \frac{1}{N^2} \sum_{i} i^2 \binom{N}{i} \pi^{i} (1-\pi)^{N-i}.$$

The summation is the second raw moment of the binomial distribution
with parameter $\pi$ and sample size N. Therefore

$$\alpha_2 = \frac{1}{N^2}\left[N\pi(1-\pi) + N^2\pi^2\right] = \frac{\pi(1-\pi)}{N} + \pi^2. \tag{33}$$

Using Eq. 33 and the fact that $\lim \Pr(A_{1,n}) = \pi$ we have

$$\lim \Pr(A_{1,n+1} \mid E_{1,n} A_{1,n}) = \pi\left(1 - \frac{1}{N}\right) + \frac{1}{N}. \tag{34a}$$

By identical methods one can establish that

$$\lim \Pr(A_{1,n+1} \mid E_{1,n} A_{2,n}) = \pi\left(1 - \frac{1}{N}\right) + \frac{c}{N}, \tag{34b}$$

$$\lim \Pr(A_{1,n+1} \mid E_{2,n} A_{1,n}) = \pi\left(1 - \frac{1}{N}\right) + \frac{1-c}{N}, \tag{34c}$$

and

$$\lim \Pr(A_{1,n+1} \mid E_{2,n} A_{2,n}) = \pi\left(1 - \frac{1}{N}\right). \tag{34d}$$
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With these formulas in hand, we need only apply elementary 
probability theory to obtain expressions for dependencies of responses 


On responses or responses on reinforcements, viz., 


lim Pr(Ay yi lAy ay) =n+t (iee)(t-n) (35a) 
lim Pr(Ay naalAs a) =n- Gece (35) 
* 4am Pr(Ay vy /B, ,) = (2 - & )x+e (35¢) 
lim Pr(Ay 4a !Bo, a) = (1 -§ ae. (35a) 
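
As a check on the algebra, the following sketch evaluates the asymptotic
second moment by direct summation over the binomial state distribution
and compares it with the closed form of Eq. 33. It also confirms that
Eq. 35a and Eq. 35c follow from Eq. 34a-d by elementary probability. The
function names and the test values ($\pi = .6$, $c = .5$, $N = 4$) are
illustrative assumptions, not quantities from the text.

```python
from math import comb


def alpha2_direct(pi, n):
    """Second raw moment of i/N under a binomial(N, pi) state distribution."""
    return sum((i / n) ** 2 * comb(n, i) * pi ** i * (1 - pi) ** (n - i)
               for i in range(n + 1))


def alpha2_closed(pi, n):
    """Eq. 33."""
    return pi * (1 - pi) / n + pi ** 2


def eq34(pi, c, n):
    """Asymptotic Pr(A1 on n+1 | event, response on n), Eq. 34a-d."""
    base = pi * (1 - 1 / n)
    return {
        "E1A1": base + 1 / n,        # 34a
        "E1A2": base + c / n,        # 34b
        "E2A1": base + (1 - c) / n,  # 34c
        "E2A2": base,                # 34d
    }


def eq35(pi, c, n):
    """Asymptotic Pr(A1 on n+1 | response or event on n), Eq. 35a-d."""
    return {
        "A1": pi + (1 - c) * (1 - pi) / n,   # 35a
        "A2": pi - (1 - c) * pi / n,         # 35b
        "E1": (1 - c / n) * pi + c / n,      # 35c
        "E2": (1 - c / n) * pi,              # 35d
    }


pi, c, n = 0.6, 0.5, 4  # illustrative values only
q34, q35 = eq34(pi, c, n), eq35(pi, c, n)
# In the noncontingent case the event on trial n is independent of the
# response on trial n, so Pr(E1 | A1) = pi and Pr(A1 | E1) = pi at asymptote.
from_34_given_a1 = pi * q34["E1A1"] + (1 - pi) * q34["E2A1"]
from_34_given_e1 = pi * q34["E1A1"] + (1 - pi) * q34["E1A2"]
print(alpha2_direct(pi, n), alpha2_closed(pi, n))
print(from_34_given_a1, q35["A1"])
print(from_34_given_e1, q35["E1"])
```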


Given a set of trigram proportions from the asymptotic data of a
two-choice experiment, we are now in a position to achieve a rigorous
test of the model by using part of the data to estimate the parameters
$c$ and $N$, and then substituting these estimates into Eq. 34a-d and
35a-d to predict the values of all eight of these sequential statistics.
We shall illustrate this procedure with the data of the .6 series. The
observed transition frequencies $F(A_{i,n+1} \mid E_{j,n}A_{k,n})$ for the
last 100 trials, aggregated over subjects, are as follows:

               $A_1$    $A_2$
$A_1E_1$        748      298
$A_1E_2$        ...      ...
$A_2E_1$        462      306
$A_2E_2$        ...      ...

An estimate of the asymptotic probability of an $A_1$ response given
an $A_1E_1$ event on the preceding trial can be obtained by dividing
the first entry in row one by the sum of the row; i.e.,
$\widehat{\Pr}(A_1 \mid E_1A_1) = 748/(748 + 298) = .715$. But, if we turn
to Eq. 34a, we note that
$\lim \Pr(A_{1,n+1} \mid E_{1,n}A_{1,n}) = \pi(1 - \frac{1}{N}) + \frac{1}{N}$.
Hence, letting $.715 = .6(1 - \frac{1}{N}) + \frac{1}{N}$, we obtain an
estimate of $N = 3.48$.* Similarly
$\widehat{\Pr}(A_1 \mid E_1A_2) = 462/(462 + 306) = .602$, which by Eq. 34b
is an estimate of $\pi(1 - \frac{1}{N}) + \frac{c}{N}$; using our values
of $\pi$ and $N$ we find that $\frac{c}{N} = .174$ and $c = .605$.

* For any one subject, $N$ must, of course, be an integer. The fact
that our estimation procedures generally yield non-integral values for
$N$ may signify that $N$ varies somewhat between subjects, or it may
simply reflect some contamination of the data by sources of experimental
error not represented in the model.
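
The two-step estimation just described can be written out as a short
sketch. The function name is hypothetical; the counts are the four
transition frequencies quoted in the text, and Eq. 34a and Eq. 34b are
simply solved for $N$ and $c/N$ in turn.

```python
def estimate_n_and_c(pi, counts_after_a1e1, counts_after_a2e1):
    """Estimate (N, c/N, c) from two rows of the transition table.

    Each argument after pi is a pair (A1 count, A2 count) on trial n+1,
    given the indicated response-event combination on trial n.
    """
    p_e1a1 = counts_after_a1e1[0] / sum(counts_after_a1e1)
    p_e1a2 = counts_after_a2e1[0] / sum(counts_after_a2e1)
    # Eq. 34a: p_e1a1 = pi * (1 - 1/N) + 1/N, solved for N.
    n = (1 - pi) / (p_e1a1 - pi)
    # Eq. 34b: p_e1a2 = pi * (1 - 1/N) + c/N, solved for c/N.
    c_over_n = p_e1a2 - pi * (1 - 1 / n)
    return n, c_over_n, c_over_n * n


n_hat, c_over_n_hat, c_hat = estimate_n_and_c(0.6, (748, 298), (462, 306))
print(f"N = {n_hat:.2f}, c/N = {c_over_n_hat:.3f}, c = {c_hat:.3f}")
```

Because the sketch carries unrounded proportions through the calculation,
its results may differ from the rounded values in the text in the third
significant figure.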


Having estimated $c$ and $N$ we may now generate predictions for
any of our asymptotic quantities. Table 3 presents predicted and
observed values for the quantities given in Eq. 34a to Eq. 35d.
Considering that only two degrees of freedom have been utilized in
estimating parameters, the close correspondence between theoretical and
observed quantities in Table 3 may be interpreted as giving considerable
support to the assumptions of the model. A similar analysis of the
asymptotic data from the .8 series, which has been reported elsewhere
(Estes, 1961b; Estes and Suppes, 1962), yields comparable agreement
between theoretical and observed trigram proportions. The estimate of
$c/N$ for the .8 data is very close to that for the .6 data (.172 vs
.174), but the estimates of $c$ and $N$ (.31 and 1.84, respectively) are
both smaller for the .8 data. It appears that the more highly practiced
subjects of the .8 series are, on the average, sampling from a smaller
population of stimulus patterns and at the same time are less responsive
to the reinforcing lights than the more naive subjects of the .6 series.

Table 3

Predicted (pattern model) and observed values of sequential statistics
for final 100 trials of the .6 series.

Asymptotic Quantity         Predicted    Observed
$\Pr(A_1 \mid E_1A_1)$        .715         .715
$\Pr(A_1 \mid E_2A_1)$        .541         .535
$\Pr(A_1 \mid E_1A_2)$        .601         .601
$\Pr(A_1 \mid E_2A_2)$        .428         .413
$\Pr(A_1 \mid A_1)$           .645         ...
$\Pr(A_1 \mid A_2)$           .532         .532
$\Pr(A_1 \mid E_1)$           .669         .667
$\Pr(A_1 \mid E_2)$           .496         .489
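
The predicted column of Table 3 can be regenerated from the two parameter
estimates. The sketch below assumes $\pi = .6$ together with the rounded
estimates $N = 3.48$ and $c = .605$, and applies Eq. 34a-d and 35a-d
directly; the function and key names are hypothetical.

```python
def table3_predictions(pi, c, n):
    """Asymptotic sequential statistics of the pattern model."""
    base = pi * (1 - 1 / n)
    return {
        "A1|E1A1": base + 1 / n,                 # Eq. 34a
        "A1|E2A1": base + (1 - c) / n,           # Eq. 34c
        "A1|E1A2": base + c / n,                 # Eq. 34b
        "A1|E2A2": base,                         # Eq. 34d
        "A1|A1": pi + (1 - c) * (1 - pi) / n,    # Eq. 35a
        "A1|A2": pi - (1 - c) * pi / n,          # Eq. 35b
        "A1|E1": (1 - c / n) * pi + c / n,       # Eq. 35c
        "A1|E2": (1 - c / n) * pi,               # Eq. 35d
    }


predicted = table3_predictions(pi=0.6, c=0.605, n=3.48)
for name, value in predicted.items():
    print(f"Pr({name}) = {value:.3f}")
```

Small discrepancies in the last digit relative to the printed table are to
be expected, since the printed values may have been computed from
unrounded estimates.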

Since no model can be expected to give a perfect account of fallible
data arising from real experiments (as distinguished from the idealized
experiments to which the model should apply strictly), it is difficult
to know how to evaluate the goodness-of-fit of theoretical to observed
values. In practice, investigators usually proceed on a largely
intuitive basis, evaluating the fit in a given instance against that
which it appears reasonable to hope for in the light of what is
known about the precision of experimental control and measurement.
Statistical tests of goodness-of-fit are sometimes possible (discussions
of some tests which may be used in conjunction with stimulus sampling
models are given by Suppes and Atkinson, 1960, and by Estes and Suppes,
1962); however, statistical tests are not entirely satisfactory
taken by themselves, for a sufficiently precise test will often
indicate significant differences between theoretical and observed values
even in cases where the agreement is as close as could reasonably be
hoped for. Generally, once a degree of descriptive accuracy has been
attained which appears satisfactory to investigators familiar with the
given area, further progress must come largely via differential tests of
alternative models.

In the case of the two-choice noncontingent situation, the
ingredients for one such test are immediately at hand; for we developed in
Sec. 2.3 a one-element, guessing state model that is comparable to the
N-element model with respect to the number of free parameters, and which
to many might seem equally plausible on psychological grounds. These
models both embody the all-or-none assumption concerning the formation
of learned associations, but they differ in the means by which they
escape the deterministic features of the simple one-element model. It
will be recalled that the one-element model cannot handle the sequential
statistics considered in this section because it requires, for example,
a probability of unity for response $A_1$ on any trial following a trial
on which $A_1$ occurred and was reinforced. In the N-element model (with
$N \geq 2$), there is no such constraint, for the stimulus pattern present
on the preceding reinforced trial may be replaced by another pattern,
possibly conditioned to a different response, on the following trial.
In the guessing state model, there is no strict determinacy since the
$A_1$ response may occur on the reinforced trial by guessing, if the
subject is in state $C_0$; and, if the reinforcement was not effective,
a different response may occur, again through guessing, on the
following trial.

The case of the guessing state model with $c = 0$ ($c$, it will be
recalled, being the counter-conditioning parameter) provides a
two-parameter model which may be compared with the two-parameter,
N-element model. We will require an expression for at least one of the
trigram proportions that have been studied above in connection with the
N-element model. Let us take $\Pr(A_{1,n+1}E_{1,n}A_{1,n})$ for this
purpose. In Sec. 2.3 we obtained an expression for
$\Pr(A_{1,n+1} \mid E_{1,n}A_{1,n})$ for the case with $c = 0$ and thus we
can write at once

$$\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\left\{u_{1,n} + \tfrac{1}{2}u_{0,n}\left[c'' + (1 - c'')\tfrac{1}{2}\right]\right\}. \qquad (36a)$$

Since we are interested only in the asymptotic case, we shall drop the
$n$ subscript from the right hand side of Eq. 36a and have for the desired
theoretical, asymptotic expression

$$p_{111} = \pi\left[u_1 + (1 + c'')\frac{u_0}{4}\right]. \qquad (36b)$$

Substituting now into Eq. 36b the expressions for $u_1$ and $u_0$ derived
in Sec. 2.3, we obtain finally

$$p_{111} = \pi \cdot \frac{\pi\left[4\pi + (1-\pi)\epsilon(1 + c'')\right]}{4\left[\pi^2 + (1-\pi)^2 + \pi(1-\pi)\epsilon\right]}. \qquad (36c)$$

To apply this model to the asymptotic data of the .6 series, we may
first evaluate the parameter $\epsilon$ by setting the observed proportion
of $A_1$ responses over the terminal 100 trials, .593, equal to the right
hand side of Eq. 21 and solving for $\epsilon$, viz.

$$.593 = \frac{\pi\left[\pi + (1-\pi)\frac{\epsilon}{2}\right]}{\pi^2 + (1-\pi)^2 + \pi(1-\pi)\epsilon}
= \frac{.6(.6 + .2\epsilon)}{.52 + .24\epsilon},$$

and

$$\epsilon = 2.315.$$

Now introducing this value for $\epsilon$ into Eq. 36c, and simplifying,
we obtain the prediction

$$p_{111} = .2782 + .0775\,c''.$$

Since the observed value of $p_{111}$ for the .6 data is .249, it is
apparent that no matter what value (in the admissible range
$0 \leq c'' \leq 1$) is chosen for the parameter $c''$, the value
predicted from the guessing state model will be too large. Further
analysis, using the methods illustrated above, makes it clear that for no
combination of parameter estimates can the guessing state model achieve
predictive accuracy comparable to that demonstrated for the N-element
model in Table 3. Although this one comparison cannot be considered
decisive, one might be inclined to suspect that, for interpretation of
two-choice, probability learning, the notion of a re-accessible guessing
state is on the wrong track, whereas the N-element sampling model merits
further investigation.
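
The guessing state calculation can be reproduced numerically. In the
sketch below the forms of Eq. 21 and Eq. 36c are assumptions,
reconstructed from the numerical substitution for $\pi = .6$ shown in the
text (Sec. 2.3 itself is not reproduced here), and the function names are
hypothetical.

```python
def solve_epsilon(pi, p_inf):
    """Solve Eq. 21 for epsilon given the asymptotic A1 proportion p_inf.

    Assumed form of Eq. 21:
        p_inf = pi * [pi + (1 - pi) * eps / 2]
                / [pi**2 + (1 - pi)**2 + pi * (1 - pi) * eps]
    """
    numerator = pi ** 2 - p_inf * (pi ** 2 + (1 - pi) ** 2)
    denominator = pi * (1 - pi) * (p_inf - 0.5)
    return numerator / denominator


def p111_guessing(pi, eps, c2):
    """Asymptotic p111 of the guessing state model (Eq. 36c); c2 stands for c''."""
    d = pi ** 2 + (1 - pi) ** 2 + pi * (1 - pi) * eps
    return pi * pi * (4 * pi + (1 - pi) * eps * (1 + c2)) / (4 * d)


eps = solve_epsilon(pi=0.6, p_inf=0.593)
intercept = p111_guessing(0.6, eps, 0.0)
slope = p111_guessing(0.6, eps, 1.0) - intercept  # p111 is linear in c''
print(f"epsilon = {eps:.3f}")
print(f"p111 = {intercept:.4f} + {slope:.4f} c''")
```

Since the intercept already exceeds the observed value of .249 and the
slope is positive, every admissible value of $c''$ yields a prediction
that is too large.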

Mean and variance of $A_1$ response proportion. By letting
$\pi_{11} = \pi_{21} = \pi$ in Eq. 28, we have immediately an expression
for the probability of an $A_1$ response on trial $n$ in the
noncontingent case, viz.

$$\Pr(A_{1,n}) = \pi - \left[\pi - \Pr(A_{1,1})\right]\left(1 - \frac{c}{N}\right)^{n-1}. \qquad (37)$$

If we define a response random variable $A_n$ which equals 1 or 0
according as $A_1$ or $A_2$, respectively, occurs on trial $n$, then the
right side of Eq. 37 also represents the expectation of this random
variable on trial $n$. The expected number of $A_1$ responses in a
series of $K$ trials is, then, given by the summation of Eq. 37 over
trials,

$$E(A_K) = \sum_{n=1}^{K} E(A_n) = K\pi - \left[\pi - \Pr(A_{1,1})\right]\frac{N}{c}\left[1 - \left(1 - \frac{c}{N}\right)^{K}\right]. \qquad (38)$$

In experimental applications, one is frequently interested in the
learning curve obtained by plotting the proportion of $A_1$ responses per
K-trial block. A theoretical expression for this learning function is
readily obtained by an extension of the method used to derive Eq. 38. Let
$x$ be the ordinal number of a K-trial block running from trial
$K(x-1) + 1$ to $Kx$, where $x = 1, 2, \ldots$, and define $P(x)$ as the
proportion of $A_1$ responses in block $x$. Then

$$P(x) = \frac{1}{K}\left[\sum_{n=1}^{Kx} \Pr(A_{1,n}) - \sum_{n=1}^{K(x-1)} \Pr(A_{1,n})\right]$$

$$= \pi - \frac{N}{cK}\left[\pi - \Pr(A_{1,1})\right]\left[1 - \left(1 - \frac{c}{N}\right)^{K}\right]\left(1 - \frac{c}{N}\right)^{K(x-1)}. \qquad (39a)$$

The value of $\Pr(A_{1,1})$ should be in the neighborhood of .5 if
response bias does not exist. However, to allow for sampling deviations
we may eliminate $\Pr(A_{1,1})$ in favor of the observed value of $P(1)$.
This can be done in the following way. Note that

$$P(1) = \pi - \frac{N}{cK}\left[\pi - \Pr(A_{1,1})\right]\left[1 - \left(1 - \frac{c}{N}\right)^{K}\right].$$

Solving for $[\pi - \Pr(A_{1,1})]$ and substituting the result in Eq. 39a,
we obtain

$$P(x) = \pi - \left[\pi - P(1)\right]\left(1 - \frac{c}{N}\right)^{K(x-1)}. \qquad (39b)$$
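
The relation among Eq. 37, 39a, and 39b can be checked numerically. The
sketch below averages Eq. 37 over a block by brute force and compares the
result with the two closed forms; the function names are hypothetical,
`theta` stands for $c/N$, and the parameter values are illustrative
choices.

```python
def p_trial(n, pi, p1, theta):
    """Eq. 37: probability of A1 on trial n, with theta = c/N."""
    return pi - (pi - p1) * (1 - theta) ** (n - 1)


def block_mean_direct(x, k, pi, p1, theta):
    """Average of Eq. 37 over trials K(x-1)+1 through Kx."""
    first = k * (x - 1) + 1
    return sum(p_trial(n, pi, p1, theta) for n in range(first, first + k)) / k


def block_mean_39a(x, k, pi, p1, theta):
    """Eq. 39a."""
    scale = (1 - (1 - theta) ** k) / (theta * k)
    return pi - (pi - p1) * scale * (1 - theta) ** (k * (x - 1))


def block_mean_39b(x, k, pi, p_block1, theta):
    """Eq. 39b, expressed through the first-block proportion P(1)."""
    return pi - (pi - p_block1) * (1 - theta) ** (k * (x - 1))


pi, p1, theta, k = 0.8, 0.5, 0.17, 12
first_block = block_mean_39a(1, k, pi, p1, theta)
for x in range(1, 5):
    print(x,
          round(block_mean_direct(x, k, pi, p1, theta), 4),
          round(block_mean_39a(x, k, pi, p1, theta), 4),
          round(block_mean_39b(x, k, pi, first_block, theta), 4))
```

The three columns printed for each block should coincide, since Eq. 39b
is only a reparameterization of Eq. 39a.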


Applications of Eq. 39b to data have led to results that are
satisfying in some respects but perplexing in others (see, e.g., Estes,
1959a). In most instances the implication that the learning curve should
have $\pi$ as an asymptote has been borne out (Estes, 1961b, 1962), and
further, with a suitable choice of values for $c/N$, the curve represented
by Eq. 39b has served to describe the course of learning. However, in
experiments run with naive subjects, as has been nearly always the case,
the value of $c/N$ required to fit the mean learning curve has been
substantially smaller than the value required to handle the sequential
statistics discussed in the preceding section. Consider, for example,
the learning curve for the .6 series plotted by 20 trial blocks. The
observed value of $P(1)$ is .48 and the value of $c/N$ estimated from
the sequential statistics of the second 20-trial block is .12. With
these parameter values, Eq. 39b yields a prediction of .59 for $P(3)$
and the theoretical curve is essentially at asymptote from block 4 on.
The empirical learning curve, however, does not approach .59 until block
6 and is still short of asymptote at the end of 12 blocks, the mean
proportion of $A_1$ responses over the last five blocks being .593
(Suppes and Atkinson, 1960, p. 197).

In the case of the .8 series there is a similar disparity between
the value of $c/N$ estimated from the sequential statistics and the value
estimated from the mean learning curve. As we have noted above, an
optimal account of the trigram proportions
$\Pr(A_{i,n+1}E_{j,n}A_{k,n})$ requires a $c/N$ value of approximately
.17. But if this estimate is substituted into Eq. 39a, the predicted
$A_1$ frequency in the first block of 12 trials is .67, compared to an
observed value of .63, and the theoretical curve runs appreciably above
the empirical curve for another five blocks. A $c/N$ value of .06 yields
a satisfactory graduation of the observed mean curve in terms of Eq. 39a,
and a fit to the trigrams that does not look bad by usual standards for
prediction in learning experiments. However, comparing predictions based
on the two $c/N$ estimates for the trigrams which contain this parameter,
we see that the estimate of .17 is distinctly superior. For the trigrams
averaged over the first 12 trials, the result is

             Observed   Theoretical, c/N = .17   Theoretical, c/N = .06
$p_{112}$      .168             .177                     ...
$p_{121}$      .061             .073                     .087
$p_{212}$      .121             .119                     .152
$p_{221}$      .062             .053                     .039

The reason for this discrepancy in the value of $c/N$ required to
give optimal descriptions of two different aspects of the data is not
clear even after much investigation. One contributing factor might be
individual differences in learning rates ($c/N$ values) among subjects;
these would be expected to affect the two types of statistics differently.
However, in the case of the .8 series, when a more homogeneous subgroup
of subjects (the middle 50% on total $A_1$ frequency) is analyzed, the
disparity, although somewhat reduced, is not eliminated; optimal $c/N$
values for the mean curve and the trigram statistics are now .08 and .15,
respectively. The principal source of the remaining discrepancy in this
homogeneous subgroup is a much smaller increment in $A_1$ frequency from
the first to the second 12-trial block than was predicted. Over the
first three blocks the observed proportions were .633, .665 and .790;
the proportions predicted from Eq. 39a with $c/N = .15$ run .657, .779,
and .800. A possible explanation is that in the early part of the series
the subjects are responding to cues, perhaps verbal in character, which
are discarded (i.e., are not resampled) when they fail to elicit
consistently correct responding. An interpretation of this sort could be
incorporated into the model and subjected to formal testing, but this
has not yet been done. In any event, one can see that analysis of data
in terms of a model enables us to determine precisely which aspects of
the subjects' behavior are and which are not accounted for in terms of
a particular set of assumptions.

Next to the mean learning curve, the most frequently used behavioral
measure in learning experiments is perhaps the variance of response
occurrences in a block of trials. Predicting this variance from a
theoretical model is an exceedingly taxing assignment, for the effects
of individual differences in learning rate, together with those of all
sources of experimental error not represented in the model, must be
expected to increase the observed response variance. However, this
statistic is relatively easy to compute for the pattern model, and the
derivation may serve as a prototype for derivations of similar expressions
in other learning models. For simplicity, we shall limit consideration
here to the case of the variance of $A_1$ response frequency in a trial
block after the mean curve has reached asymptote.

As a preliminary to computation of the variance, we require a
statistic which is also of interest in its own right, the covariance of
$A_1$ responses on any two trials; that is (using the notation of
Eq. 2-5),

$$\mathrm{Cov}(A_{n+k}A_n) = E(A_{n+k}A_n) - E(A_{n+k})E(A_n)
= \Pr(A_{1,n+k}A_{1,n}) - \Pr(A_{1,n+k})\Pr(A_{1,n}). \qquad (40)$$

First, we can establish by induction that

$$\Pr(A_{1,n+k}A_{1,n}) = \pi\Pr(A_{1,n}) - \left[\pi\Pr(A_{1,n}) - \Pr(A_{1,n+1}A_{1,n})\right]\left(1 - \frac{c}{N}\right)^{k-1}.$$

This formula is obviously an identity for $k = 1$. Thus, assuming that
the formula holds for trials $n$ and $n + k$, we may proceed to establish
it for trials $n$ and $n + k + 1$. First we use our standard procedure
to expand the desired quantity in terms of reinforcing events and states
of conditioning. Letting $C_{j,n}$ denote the state in which exactly $j$
of the $N$ elements are conditioned to response $A_1$, we may write

$$\Pr(A_{1,n+k+1}A_{1,n}) = \sum_{i,j} \Pr(A_{1,n+k+1}E_{i,n+k}C_{j,n+k}A_{1,n})$$

$$= \sum_{i,j} \Pr(A_{1,n+k+1} \mid E_{i,n+k}C_{j,n+k}A_{1,n})\,\Pr(E_{i,n+k}C_{j,n+k}A_{1,n}).$$

Now we can make use of the assumptions that specify the noncontingent
case to simplify the second factor to $\pi\Pr(C_{j,n+k}A_{1,n})$ and
$(1-\pi)\Pr(C_{j,n+k}A_{1,n})$ for $i = 1, 2$, respectively. Also, we may
apply the learning axioms to the first factor, obtaining

$$\Pr(A_{1,n+k+1} \mid E_{1,n+k}C_{j,n+k}A_{1,n}) = \frac{j}{N} + \left(1 - \frac{j}{N}\right)\frac{c}{N}$$

and

$$\Pr(A_{1,n+k+1} \mid E_{2,n+k}C_{j,n+k}A_{1,n}) = \frac{j}{N}\left(1 - \frac{c}{N}\right).$$

Combining these results, we have

$$\Pr(A_{1,n+k+1}A_{1,n}) = \sum_{j} \left\{\pi\left[\frac{j}{N} + \left(1 - \frac{j}{N}\right)\frac{c}{N}\right] + (1-\pi)\frac{j}{N}\left(1 - \frac{c}{N}\right)\right\}\Pr(C_{j,n+k}A_{1,n})$$

$$= \sum_{j} \left[\left(1 - \frac{c}{N}\right)\frac{j}{N} + \pi\frac{c}{N}\right]\Pr(C_{j,n+k}A_{1,n})$$

$$= \left(1 - \frac{c}{N}\right)\Pr(A_{1,n+k}A_{1,n}) + \pi\frac{c}{N}\Pr(A_{1,n}).$$

Substituting into this expression in terms of our inductive hypothesis
yields

$$\Pr(A_{1,n+k+1}A_{1,n}) = \left(1 - \frac{c}{N}\right)\left\{\pi\Pr(A_{1,n}) - \left[\pi\Pr(A_{1,n}) - \Pr(A_{1,n+1}A_{1,n})\right]\left(1 - \frac{c}{N}\right)^{k-1}\right\} + \pi\frac{c}{N}\Pr(A_{1,n})$$

$$= \pi\Pr(A_{1,n}) - \left[\pi\Pr(A_{1,n}) - \Pr(A_{1,n+1}A_{1,n})\right]\left(1 - \frac{c}{N}\right)^{k},$$

as required.
We wish to take the limit of the right side of Eq. 40 as
$n \to \infty$ in order to obtain the covariance of the response random
variable on any two trials at asymptote. The limits of $\Pr(A_{1,n})$ and
$\Pr(A_{1,n+k})$ we know to be equal to $\pi$, and from Eq. 35a we have
the expression $\pi^2 + \pi(1-\pi)\frac{(1-c)}{N}$ for the limit of
$\Pr(A_{1,n+1}A_{1,n})$. Making the appropriate substitutions in Eq. 40
yields the simple result

$$\lim_{n \to \infty} \mathrm{Cov}(A_{n+k}A_n) = \pi^2 - \left[\pi^2 - \pi^2 - \pi(1-\pi)\frac{(1-c)}{N}\right]\left(1 - \frac{c}{N}\right)^{k-1} - \pi^2$$

$$= \pi(1-\pi)\frac{(1-c)}{N}\left(1 - \frac{c}{N}\right)^{k-1}. \qquad (41)$$

Now we are ready to compute $\mathrm{Var}(A_K)$, the variance of $A_1$
response frequencies in a block of $K$ trials at asymptote, by applying
the standard theorem for the variance of a sum of random variables
(Feller, 1957):

$$\mathrm{Var}(A_K) = \lim_{n \to \infty}\left\{K\,\mathrm{Var}(A_n) + 2\sum_{1 \leq i < j \leq K} \mathrm{Cov}(A_{n+j}A_{n+i})\right\}.$$

Since

$$\lim_{n \to \infty} E(A_n^2) = \pi \cdot 1 + (1-\pi) \cdot 0 = \pi,$$

the limiting variance of $A_n$ is simply

$$\lim_{n \to \infty} \mathrm{Var}(A_n) = \lim_{n \to \infty} E(A_n^2) - \lim_{n \to \infty} \left[E(A_n)\right]^2 = \pi(1-\pi).$$

Substituting this result and that for $\lim \mathrm{Cov}(A_{n+k}A_n)$
into the general expression for $\mathrm{Var}(A_K)$, we obtain

$$\mathrm{Var}(A_K) = K\pi(1-\pi) + 2\sum_{i=1}^{K-1}\sum_{j=i+1}^{K} \pi(1-\pi)\frac{(1-c)}{N}\left(1 - \frac{c}{N}\right)^{j-i-1}$$

$$= K\pi(1-\pi) + 2\pi(1-\pi)\frac{(1-c)}{N}\sum_{j=1}^{K-1}(K-j)\left(1 - \frac{c}{N}\right)^{j-1}$$

$$= K\pi(1-\pi) + \frac{2\pi(1-\pi)(1-c)}{c}\left\{K - \frac{N}{c}\left[1 - \left(1 - \frac{c}{N}\right)^{K}\right]\right\}. \qquad (42)$$
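
The closed form of Eq. 42 can be verified against the double sum from
which it is obtained. The sketch below does so for the parameter values
of the .8 series reported in this section ($\pi = .8$, $c/N = .17$,
$N = 1.84$, $K = 48$); taking $c$ as the product of the two rounded
estimates is an assumption, and the function names are hypothetical.

```python
from math import sqrt


def asymptotic_cov(lag, pi, c, n):
    """Eq. 41: asymptotic covariance of responses `lag` trials apart."""
    return pi * (1 - pi) * (1 - c) / n * (1 - c / n) ** (lag - 1)


def var_block_direct(k, pi, c, n):
    """Variance of a sum of K responses via the variance-of-a-sum theorem."""
    total = k * pi * (1 - pi)
    for i in range(1, k):
        for j in range(i + 1, k + 1):
            total += 2 * asymptotic_cov(j - i, pi, c, n)
    return total


def var_block_closed(k, pi, c, n):
    """Eq. 42."""
    theta = c / n
    geometric = (1 - (1 - theta) ** k) / theta
    return k * pi * (1 - pi) + 2 * pi * (1 - pi) * (1 - c) / c * (k - geometric)


n_elements = 1.84
c = 0.17 * n_elements  # assumed: c recovered from the estimates of c/N and N
variance = var_block_closed(48, 0.8, c, n_elements)
print(f"direct = {var_block_direct(48, 0.8, c, n_elements):.3f}")
print(f"closed = {variance:.3f}, sd = {sqrt(variance):.2f}")
```

With these rounded inputs the variance should come out near, though not
necessarily identical to, the value reported in the text; any small
difference presumably reflects rounding of the parameter estimates.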


Application of this formula can be conveniently illustrated in terms
of the asymptotic data for the .8 series. Least squares determinations
of $c/N$ and $N$ from the trigram proportions (using Eq. 34a-d) yielded
estimates of .17 and 1.84, respectively (Estes and Suppes, 1962).
Inserting these values into Eq. 42, we obtain for a 48 trial block at
asymptote, $\mathrm{Var}(A_K) = 37.50$; this variance corresponds to a
standard deviation of 6.12. The observed standard deviation for the final
48 trial block was 6.94. Thus, the theory predicts a variance of the right
order of magnitude, but, as anticipated, underestimates the observed
value.

Of the many other statistics that can be derived from the N-element
model for two-choice learning data, we shall give one final example,
selected primarily for the purpose of reviewing the technique for
deriving sequential statistics. This technique is so generally useful
that the major steps should be emphasized: first, expand the desired
expression in terms of the conditioning states (as done, for example,
in the case of Eq. 30); second, conditionalize responses and
reinforcing events on the preceding sequence of events, introducing
whatever simplifications are permitted by the boundary conditions of the
case under consideration; third, apply the axioms and simplify to
obtain the appropriate result. These steps will now be followed in
deriving an expression of considerable interest in its own right,
the probability of an $A_1$ response following a sequence of exactly
$v$ $E_1$ reinforcing events:

$$\Pr(A_{1,n+v} \mid E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1})
= \frac{\Pr(A_{1,n+v}E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1})}{\pi^v(1-\pi)}$$

$$= \frac{1}{\pi^v(1-\pi)}\sum_{i,j} \Pr(A_{1,n+v}C_{i,n+v}E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}C_{j,n-1})$$

$$= \frac{1}{\pi^v(1-\pi)}\sum_{i,j} \Pr(A_{1,n+v} \mid C_{i,n+v}E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}C_{j,n-1})$$
$$\qquad \cdot \Pr(C_{i,n+v} \mid E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}C_{j,n-1})\,\Pr(E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1} \mid C_{j,n-1})\,\Pr(C_{j,n-1})$$

$$= \frac{1}{\pi^v(1-\pi)}\sum_{i,j} \frac{i}{N}\,\Pr(C_{i,n+v} \mid E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}C_{j,n-1})\,\pi^v(1-\pi)\,\Pr(C_{j,n-1})$$

$$= \sum_{j=0}^{N}\left\{\left(1 - c\frac{j}{N}\right)\left[1 - \left(1 - \frac{j}{N}\right)\left(1 - \frac{c}{N}\right)^{v}\right] + c\frac{j}{N}\left[1 - \left(1 - \frac{j-1}{N}\right)\left(1 - \frac{c}{N}\right)^{v}\right]\right\}\Pr(C_{j,n-1})$$

$$= 1 - \left(1 - \frac{c}{N}\right)^{v}\sum_{j=0}^{N}\left[1 - \frac{j}{N} + \frac{c}{N}\cdot\frac{j}{N}\right]\Pr(C_{j,n-1})$$

$$= 1 - (1 - p_{n-1})\left(1 - \frac{c}{N}\right)^{v} - \frac{c}{N}\,p_{n-1}\left(1 - \frac{c}{N}\right)^{v}$$

$$= 1 - \left[1 - \left(1 - \frac{c}{N}\right)p_{n-1}\right]\left(1 - \frac{c}{N}\right)^{v}. \qquad (43)$$

The derivation has a formidable appearance, mainly because we have
spelled out the steps in more than customary detail, but each step can
readily be justified. The first involves simply using the definition of
a conditional probability, $\Pr(A \mid B) = \frac{\Pr(AB)}{\Pr(B)}$,
together with the fact that in the simple noncontingent case,
$\Pr(E_{1,n}) = \pi$ and $\Pr(E_{2,n}) = 1 - \pi$ for all $n$, and
$\Pr(E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}) = \pi^v(1-\pi)$. The
second step introduces the conditioning states $C_{i,n+v}$ and
$C_{j,n-1}$, denoting the states in which $i$ elements are conditioned to
$A_1$ on trial $n + v$ and $j$ elements on trial $n - 1$, respectively.
Their insertion into the right-hand expression of line 1 is permissible
since the summation of $\Pr(C_i)$ over all values of $i$ is unity and
similarly for the summation of $\Pr(C_j)$. The third step is based solely
on repeated application of the defining equation for a conditional
probability, which permits the expansion
$\Pr(ABC \ldots J) = \Pr(A \mid BC \ldots J)\Pr(B \mid C \ldots J) \cdots \Pr(J)$.
The fourth step involves assumptions of the model: the
conditionalization of $A_{1,n+v}$ on the preceding sequence can be reduced
to $\Pr(A_{1,n+v} \mid C_{i,n+v}) = \frac{i}{N}$ since, according to the
theory, the preceding history affects response probability on a given
trial only insofar as it determines the state of conditioning, i.e., the
proportion of elements conditioned to the given response. The
decomposition of $\Pr(E_{1,n+v-1} \cdots E_{1,n}E_{2,n-1}C_{j,n-1})$
into $\pi^v(1-\pi)\Pr(C_{j,n-1})$ is justified by the special assumptions
of the simple noncontingent case. The fifth step involves calculating,
for each value of $j$ on trial $n - 1$, the expected proportion of
elements conditioned to $A_1$ on trial $n + v$. There are two main
branches to the process starting with state $C_j$ on trial $n - 1$. In
one, which by the axioms has probability $1 - c\frac{j}{N}$, the state of
conditioning is unchanged by the $E_2$ event on trial $n - 1$; then,
applying Eq. 37 with $\pi = 1$ (since from trial $n$ onward we are dealing
with a sequence of $E_1$'s) and $\Pr(A_{1,1}) = \frac{j}{N}$, we obtain
the expression $\left\{1 - \left(1 - \frac{j}{N}\right)\left(1 - \frac{c}{N}\right)^{v}\right\}$
for the expected proportion of elements connected to $A_1$ on trial
$n + v$ in this branch. In the other branch, which has probability
$c\frac{j}{N}$, application of Eq. 37 with $\pi = 1$ and
$\Pr(A_{1,1}) = \frac{j-1}{N}$ yields the expression
$\left\{1 - \left(1 - \frac{j-1}{N}\right)\left(1 - \frac{c}{N}\right)^{v}\right\}$
for the expected proportion of elements connected to $A_1$ on trial
$n + v$. Carrying out the summation over $j$, and using the by now
familiar property of the model that

$$\sum_{j=0}^{N} \frac{j}{N}\Pr(C_{j,n-1}) = \Pr(A_{1,n-1}) = p_{n-1},$$

we finally arrive at the desired expression for probability of $A_1$
following exactly $v$ $E_1$'s.

Application of Eq. 43 can conveniently be illustrated in terms of
the .8 series. Using the estimate of .17 for $c/N$ (obtained previously
from the trigram statistics) and taking $p_{n-1} = .83$ (the mean
proportion of $A_1$ responses over the last 96 trials of the .8 series),
we can compute the following values for the conditional response
proportions:

$v$              0       1       2       3       4
Theoretical    .689    .742    .786    .822    .852
Observed       .695    .787    .838    .859    .897

It can be seen that the trend of the theoretical values represents quite
well the trend of the observed proportions over the last 96 trials.
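
The theoretical row of this comparison follows directly from Eq. 43. The
sketch below evaluates the closed form for $v = 0$ to 4 and, as a check
on the fifth step of the derivation, compares it with the two-branch sum
over conditioning states for a small illustrative case. The function
names, and the values $N = 3$ and $c = .4$ used in the check, are
assumptions made for illustration.

```python
from math import comb


def p_a1_after_run(v, p_prev, theta):
    """Eq. 43: Pr(A1) after exactly v E1 events preceded by an E2; theta = c/N."""
    return 1 - (1 - (1 - theta) * p_prev) * (1 - theta) ** v


def p_a1_after_run_by_states(v, c, n, state_probs):
    """Fifth line of the derivation: sum over states C_j on trial n-1."""
    r = (1 - c / n) ** v
    total = 0.0
    for j, prob in enumerate(state_probs):
        unchanged = (1 - c * j / n) * (1 - (1 - j / n) * r)
        decremented = (c * j / n) * (1 - (1 - (j - 1) / n) * r)
        total += (unchanged + decremented) * prob
    return total


theoretical = [p_a1_after_run(v, p_prev=0.83, theta=0.17) for v in range(5)]
print([round(value, 3) for value in theoretical])

# Consistency check with a binomial state distribution (N = 3, pi = .8).
n, c, pi = 3, 0.4, 0.8
states = [comb(n, j) * pi ** j * (1 - pi) ** (n - j) for j in range(n + 1)]
p_prev = sum(j / n * prob for j, prob in enumerate(states))
print(p_a1_after_run_by_states(2, c, n, states), p_a1_after_run(2, p_prev, c / n))
```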


Somewhat surprisingly, the observed proportions run slightly above
the predicted values. There is no indication here of the "negative
recency effect" (decrease in $A_1$ proportion with increasing length of
the $E_1$ sequence) reported in a number of published two-choice studies
(e.g., Jarvik, 1951; Nicks, 1959). It may be significant that no
negative recency effect is observed in the .8 series, which, it will be
recalled, involved well-practiced subjects who had had experience with a
wide range of $\pi$ values in preceding series. However, the effect is
observed in the .6 series, conducted with subjects new to this type of
experiment (cf. Suppes and Atkinson, 1960, pp. 212-213). This differential
result appears to support the idea (Estes, 1962) that the negative recency
phenomenon is attributable to guessing habits carried over from everyday
life to the experimental situation and extinguished during a long training
series conducted with noncontingent reinforcement.

We shall conclude our analysis of the N-element pattern model by
proving a very general "matching theorem." The substance of this theorem
is that, so long as either an $E_1$ or an $E_2$ reinforcing event occurs
on each trial, the proportion of $A_1$ responses for any individual
subject should tend to match the proportion of $E_1$ events over a
sufficiently long series of trials regardless of the reinforcement
schedule.

For purposes of this derivation, we shall identify by a subscript
$x$ the probabilities and events associated with the individual $x$ in a
population of subjects; thus $p_{x1,n}$ will denote probability of an
$A_1$ response by subject $x$ on trial $n$, and $E_{x1,n}$ and $A_{x1,n}$
will denote random variables which take on the values 1 or 0 according as
an $E_1$ event and an $A_1$ response do or do not occur in this subject's
protocol on trial $n$. With this notation, the probability of an $A_1$
response by subject $x$ on trial $n + 1$ can be expressed by the recursion

$$p_{x1,n+1} = p_{x1,n} + \frac{c}{N}\left(E_{x1,n} - A_{x1,n}\right). \qquad (44)$$

The genesis of Eq. 44 should be reasonably obvious if we recall that
$p_{x1,n}$ is equal to the proportion of elements currently conditioned to
the $A_1$ response. This proportion can change only if an $E_1$ event
occurs on a trial when a stimulus pattern conditioned to $A_2$ is sampled,
in which case $E_{x1,n} - A_{x1,n} = 1 - 0 = 1$, or if an $E_2$ event
occurs on a trial when a pattern conditioned to $A_1$ is sampled, in which
case $E_{x1,n} - A_{x1,n} = 0 - 1 = -1$. In the former case, the
proportion of patterns conditioned to $A_1$ increases by $\frac{1}{N}$ if
conditioning is effective (which has probability $c$) and in the latter
case this proportion decreases by $\frac{1}{N}$ (again with probability
$c$).
Considering now a series of, say, $n^*$ trials, we can convert Eq. 44
into an analogous recursion for response proportions over the series
simply by summing both sides over $n$ and dividing by $n^*$, viz.

$$\frac{1}{n^*}\sum_{n=1}^{n^*} p_{x1,n+1} = \frac{1}{n^*}\sum_{n=1}^{n^*} p_{x1,n} + \frac{c}{Nn^*}\sum_{n=1}^{n^*}\left(E_{x1,n} - A_{x1,n}\right).$$

Now we subtract the first sum on the right from both sides of the
equation, and distribute the second sum on the right yielding

$$\frac{p_{x1,n^*+1} - p_{x1,1}}{n^*} = \frac{c}{N}\left[\frac{1}{n^*}\sum_{n=1}^{n^*} E_{x1,n} - \frac{1}{n^*}\sum_{n=1}^{n^*} A_{x1,n}\right].$$

The limit of the left side of this last equation is obviously zero as
$n^* \to \infty$; thus, taking the limit and rearranging, we have*

$$\lim_{n^* \to \infty} \frac{1}{n^*}\sum_{n=1}^{n^*} A_{x1,n} = \lim_{n^* \to \infty} \frac{1}{n^*}\sum_{n=1}^{n^*} E_{x1,n}. \qquad (45)$$

* Equation 45 holds only if the two limits exist, which will be the case
if the reinforcing event on trial $n$ depends at most on the outcomes of
some finite number of preceding trials. When this restriction is not
satisfied, a substantially equivalent theorem can be derived simply by
dividing both sides of the equation immediately preceding by
$\frac{c}{Nn^*}\sum_{n=1}^{n^*} E_{x1,n}$ before passing to the limit;
that is

$$\frac{p_{x1,n^*+1} - p_{x1,1}}{\frac{c}{N}\sum_{n=1}^{n^*} E_{x1,n}} = 1 - \frac{\sum_{n=1}^{n^*} A_{x1,n}}{\sum_{n=1}^{n^*} E_{x1,n}}.$$

Except for special cases in which the sum in the denominators converges,
the limit of the left-hand side is zero and

$$\lim_{n^* \to \infty} \frac{\sum_{n=1}^{n^*} A_{x1,n}}{\sum_{n=1}^{n^*} E_{x1,n}} = 1.$$

To appreciate the strength of this prediction one should note that
it holds for the data of an individual subject starting at any
arbitrarily selected point in a learning series, provided only that a
sufficiently long block of trials following that point is available for
analysis. Further, it holds regardless of the values of the parameters
$N$ and $c$ (provided that the latter is not zero) and regardless of the
way in which the schedule of reinforcement may depend on preceding events,
the trial number, the subject's behavior, or even events outside the
system (e.g., the behavior of another individual in a competitive or
cooperative social situation). Examples of empirical applications of
this theorem under a variety of reinforcement schedules are to be found
in studies reported by Estes, 1957a and Friedman et al., 1960.
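
The matching prediction can be illustrated by simulating a single
subject. The sketch below is a minimal Monte Carlo rendering of the
N-element pattern model under a response-contingent schedule; the
schedule probabilities (.7 after $A_1$, .4 after $A_2$), the parameter
values, the seed, and the function name are all assumptions chosen for
illustration, not values from the studies cited.

```python
import random


def simulate_subject(n_trials, n_patterns, c, pr_e1_after_a1, pr_e1_after_a2, seed=0):
    """Simulate one subject; return (proportion of A1, proportion of E1)."""
    rng = random.Random(seed)
    # True means the pattern is currently conditioned to A1.
    conditioned_to_a1 = [rng.random() < 0.5 for _ in range(n_patterns)]
    a1_count = 0
    e1_count = 0
    for _ in range(n_trials):
        k = rng.randrange(n_patterns)   # exactly one pattern is sampled
        a1 = conditioned_to_a1[k]       # the response follows its conditioning
        pr_e1 = pr_e1_after_a1 if a1 else pr_e1_after_a2
        e1 = rng.random() < pr_e1       # either E1 or E2 occurs on every trial
        if rng.random() < c:            # conditioning is effective with probability c
            conditioned_to_a1[k] = e1
        a1_count += int(a1)
        e1_count += int(e1)
    return a1_count / n_trials, e1_count / n_trials


a1_prop, e1_prop = simulate_subject(200_000, 4, 0.3, 0.7, 0.4, seed=1)
print(f"proportion of A1 = {a1_prop:.3f}, proportion of E1 = {e1_prop:.3f}")
```

Over a long series the two proportions should nearly coincide even though
the schedule depends on the subject's own responses; in a finite run a
small sampling discrepancy remains.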


3.3 Analysis of a Paired Comparison Learning Experiment 

In order to exhibit a somewhat different interpretation of the axioms 
“of Sec. 3.1, we shall now analyze an experiment involving a paired- 
comparison procedure. The experimental situation consists of a sequence 
of discrete trials. There are r objects, denoted A,Gi=2 tor). On 
each trial two (or more) of these objects are presented to the subject 2 
and he is required to choose between them. Once his response has been 
made the trial terminates with the subject winning or losing a fixed 
amount of money. The subject's task is to win as frequently as possible. 
There are many aspects of the situation that can be manipulated by the 
experimenter; for example, the strategy by which the experimenter makes 


available certain subsets of objects from which the subjects must choose, 
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the. schedule by which the experimenter determines whether the selection 
of a given object leads to a win or loss, and the amount of money won or 
lost on each trial. 

The particular experiment for which we shall essay a theoretical
analysis was reported by Suppes and Atkinson (1960, Ch. 11). The problem
for the subjects involved repeated choices from subsets of a set of
three objects, which may be denoted $A_1$, $A_2$, and $A_3$. On each trial
one of the following subsets of objects was presented: $(A_1A_2)$,
$(A_1A_3)$, $(A_2A_3)$, or $(A_1A_2A_3)$. The subject selected one of the
objects in the presentation set; then the trial terminated with a win or a
loss of a small sum of money. The four presentation sets $(A_1A_2)$,
$(A_1A_3)$, $(A_2A_3)$, and $(A_1A_2A_3)$ occurred with equal probabilities
over the series of trials. Further, if object $A_i$ was selected on a trial
then with probability $\lambda_i$ the subject lost and with probability
$1 - \lambda_i$ he won the predesignated amount. More complex schedules of
reinforcement could be used; of particular interest is a schedule where
the likelihood of a win following the selection of a given object depends
on the other available objects in the presentation group. For example, the
probability of a win following an $A_1$ choice could differ depending on
whether the $(A_1A_2)$, $(A_1A_3)$, or $(A_1A_2A_3)$ presentation group
occurred. The analysis of these more complex schedules does not introduce
new mathematical problems and may be pursued by the same methods we shall
use for the simpler case.


Before the axioms of Sec. 3.1 can be applied to the present experiment
we need to provide an interpretation of the stimulus situation
confronting the subject from trial to trial. The one we select is somewhat
arbitrary and in Part 4 alternative interpretations are examined.
Of course, discrepancies between predicted and observed quantities will
indicate ways in which our particular analysis of the stimulus needs to
be modified.

We shall represent the stimulus display associated with the
presentation of the pair of objects $(A_iA_j)$ by a set $S_{ij}$ of
stimulus patterns of size $N$; the triple of objects $(A_1A_2A_3)$ will be
represented by a set of stimulus patterns $S_{123}$ of size $N^*$. Thus,
there are four sets of stimulus patterns, and we assume that the sets are
pairwise disjoint (i.e., have no patterns in common). Since, in the model
under consideration, the stimulus element sampled on any trial represents
the full pattern of stimulation effective on the trial, one might wonder
why a given combination of objects, say $(A_1A_2)$, should have more than
one element associated with it. It might be remarked in this connection
that in introducing a parameter $N$ to represent set size, we do not
necessarily assume $N > 1$. We simply allow for the possibility that
such variations in the situation as different orders of presentation of
the same set of objects on different trials might give rise to different
stimulus patterns. The assumption that the stimulus patterns associated
with a given presentation set are pairwise disjoint does not seem
appealing on common sense grounds; nevertheless it is of interest to see
how far we can go in predicting the data of a paired-comparison learning
experiment with the simplified model incorporating this highly restrictive
assumption. Even though we cannot attempt to handle the positive and
negative transfer effects that must occur between different members of
the set of patterns associated with a given combination of objects during
learning, we may hope to account for statistics of asymptotic data.
When the pair of objects $(A_iA_j)$ is presented the subject must
select $A_i$ or $A_j$ (i.e., make response $A_i$ or $A_j$); hence all
pattern elements in $S_{ij}$ become conditioned to $A_i$ or $A_j$.
Similarly all elements in $S_{123}$ become conditioned to $A_1$, $A_2$, or
$A_3$. When $(A_iA_j)$ is presented the subject samples a single pattern
from $S_{ij}$ and makes the response to which the pattern is conditioned.

The final step, before applying the axioms of Sec. 3.1, is to
provide an interpretation of reinforcing events. Our analysis is as
follows: If $(A_iA_j)$ is presented and the $A_i$ object is selected, then
(a) the $E_i$ reinforcing event occurs if the $A_i$ response is followed
by a win and (b) the $E_j$ event occurs if the $A_i$ response is followed
by a loss. If $(A_1A_2A_3)$ is presented and the $A_i$ object is selected,
then (a) the $E_i$ event occurs if the $A_i$ response is followed by a win
and (b) $E_j$ or $E_k$ occurs, the two events having equal probabilities,
if the $A_i$ response is followed by a loss. This collection of rules
represents only one way of relating the observable trial outcomes to the
hypothetical reinforcing events. For example, when $A_1$ is selected
given $(A_1A_2A_3)$ and followed by a loss, rather than having $E_2$ or
$E_3$ occur with equal likelihoods, one might postulate that they occur
with probabilities dependent on the ratio of wins following $A_2$
responses to wins following $A_3$ responses over previous trials. Many such
variations in the rules of correspondence between trial outcomes and
reinforcing events have been explored; these variations become
particularly important when the experimenter manipulates the amount of
money won or lost, the magnitude of reward in animal studies, and related
variables (see Estes, 1960b; Atkinson, 1962b; and Suppes and Atkinson,
1960, Chapter 11, for discussions of this point).


In analyzing the model we shall use the following notation:

$A_{i,n}^{(ij)}$ = occurrence of an $A_i$ response on the $n$th
presentation of $(A_iA_j)$ [note that the reference is not to the $n$th
trial of the experiment but to the $n$th presentation of $(A_iA_j)$];

$W_n^{(ij)}$ = a win on the $n$th presentation of $(A_iA_j)$;

$L_n^{(ij)}$ = a loss on the $n$th presentation of $(A_iA_j)$.


We now proceed to derive the probability of an $A_i$ response on the
$n$th presentation of $(A_iA_j)$; namely $\Pr(A_{i,n}^{(ij)})$. First we
note that the state of conditioning of a stimulus pattern can change only
when it is sampled. Since all of the sets of stimulus patterns are pairwise
disjoint the sequence of trials on which $(A_iA_j)$ is presented forms a
learning process that may be studied independently of what happens on
other trials (see Axiom C4); that is, the interspersing of other types of
trials between the $n$th and $(n+1)$st presentation of $(A_iA_j)$ has no
effect on the conditioning of patterns in set $S_{ij}$.

We now want to obtain a recursive expression for $\Pr(A_{i,n}^{(ij)})$.
This can be done by using the same methods employed in the preceding
section. But to illustrate another approach we proceed differently in this
case.





Let $\Pr(A_{i,n}^{(ij)}) = y_n$ and $\Pr(A_{j,n}^{(ij)}) = 1 - y_n$. Then
the possible changes in $y_n$ are given in Figure 6. With probability
$1 - c$ no change occurs in conditioning regardless of trial events and
hence $y_{n+1} = y_n$; with probability $c$ change can occur. If $A_i$
occurs and is followed by a win then the sampled element remains
conditioned to $A_i$; however, if a loss occurs the sampled element (which
was conditioned to $A_i$) becomes conditioned to $A_j$ and thus
$y_{n+1} = y_n - \frac{1}{N}$. If $A_j$ occurs and is followed by a win
then $y_{n+1} = y_n$; however, if it is followed by a loss the sampled
element (which was conditioned to $A_j$) becomes conditioned to $A_i$,
hence $y_{n+1} = y_n + \frac{1}{N}$. Putting these results together we
have:

$$
\begin{aligned}
y_{n+1} ={}& y_n(1 - c) + y_n\,[c\,y_n(1 - \lambda_i)]
  + \left(y_n - \tfrac{1}{N}\right)(c\,y_n\lambda_i) \\
 &+ y_n\,[c(1 - y_n)(1 - \lambda_j)]
  + \left(y_n + \tfrac{1}{N}\right)[c(1 - y_n)\lambda_j],
\end{aligned}
$$

which simplifies to the expression

$$
y_{n+1} = y_n\left[1 - \frac{c}{N}(\lambda_i + \lambda_j)\right] + \frac{c}{N}\lambda_j .
$$

Solving this difference equation, we obtain

$$
\Pr(A_{i,n}^{(ij)}) = \frac{\lambda_j}{\lambda_i + \lambda_j}
 - \left[\frac{\lambda_j}{\lambda_i + \lambda_j} - \Pr(A_{i,1}^{(ij)})\right]
   \left[1 - \frac{c}{N}(\lambda_i + \lambda_j)\right]^{n-1} . \tag{47}
$$

Insert Figure 6 about here

Fig. 6. Branching process for a diad probability on a paired
comparison learning trial.
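As a numerical check on Eq. 47, the following minimal sketch iterates the recursion $y_{n+1} = y_n[1 - (c/N)(\lambda_i + \lambda_j)] + (c/N)\lambda_j$ and compares the result with the closed-form solution. The function names and the particular values of $c$, $N$, and $y_1$ are illustrative assumptions, not estimates from the experiment.

```python
def iterate_mean(y1, lam_i, lam_j, c, n_patterns, n):
    """Apply the one-step recursion (n - 1) times, starting from y_1."""
    y = y1
    rate = (c / n_patterns) * (lam_i + lam_j)
    for _ in range(n - 1):
        y = y * (1 - rate) + (c / n_patterns) * lam_j
    return y


def closed_form(y1, lam_i, lam_j, c, n_patterns, n):
    """Closed-form solution of the recursion (Eq. 47)."""
    asymptote = lam_j / (lam_i + lam_j)
    rate = (c / n_patterns) * (lam_i + lam_j)
    return asymptote - (asymptote - y1) * (1 - rate) ** (n - 1)


# Hypothetical parameter values, chosen only for illustration.
Y1, LAM_I, LAM_J, C, N_PATTERNS = 0.5, 0.6, 0.8, 0.3, 4
for n in (1, 5, 25, 200):
    print(n, iterate_mean(Y1, LAM_I, LAM_J, C, N_PATTERNS, n),
          closed_form(Y1, LAM_I, LAM_J, C, N_PATTERNS, n))
```

Both columns approach $\lambda_j/(\lambda_i + \lambda_j)$, which does not involve $c$ or $N$; those parameters affect only the rate of approach.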







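We now consider $\Pr(A_{i,n}^{(123)})$; for simplicity let
$\alpha_n = \Pr(A_{1,n}^{(123)})$, $\beta_n = \Pr(A_{2,n}^{(123)})$, and
$1 - \alpha_n - \beta_n = \Pr(A_{3,n}^{(123)})$. The possible changes in
$\alpha_n$ are given in Figure 7. For example, on the bottom branch
conditioning is effective and an $A_3$ response occurs which leads to a
loss; hence $E_1$ or $E_2$ occur with equal probabilities. But an $A_3$
followed by $E_1$ makes $\alpha_{n+1} = \alpha_n + \frac{1}{N^*}$, while
$A_3$ followed by $E_2$ makes $\alpha_{n+1} = \alpha_n$. Combining the
results in this figure yields the following difference equation:

$$
\begin{aligned}
\alpha_{n+1} ={}& (1 - c)\alpha_n + \alpha_n[c\,\alpha_n(1 - \lambda_1)]
  + \left(\alpha_n - \tfrac{1}{N^*}\right)(c\,\alpha_n\lambda_1) \\
 &+ \alpha_n[c\,\beta_n(1 - \lambda_2)]
  + \left(\alpha_n + \tfrac{1}{N^*}\right)\left[c\,\beta_n\lambda_2\tfrac{1}{2}\right]
  + \alpha_n\left[c\,\beta_n\lambda_2\tfrac{1}{2}\right] \\
 &+ \alpha_n[c(1 - \alpha_n - \beta_n)(1 - \lambda_3)]
  + \left(\alpha_n + \tfrac{1}{N^*}\right)\left[c(1 - \alpha_n - \beta_n)\lambda_3\tfrac{1}{2}\right] \\
 &+ \alpha_n\left[c(1 - \alpha_n - \beta_n)\lambda_3\tfrac{1}{2}\right] .
\end{aligned}
$$

Simplifying this result we obtain

$$
\alpha_{n+1} = \alpha_n\left[1 - \frac{c}{2N^*}(2\lambda_1 + \lambda_3)\right]
 + \beta_n\frac{c}{2N^*}(\lambda_2 - \lambda_3) + \frac{c}{2N^*}\lambda_3 . \tag{48a}
$$

By a similar argument we obtain

$$
\beta_{n+1} = \beta_n\left[1 - \frac{c}{2N^*}(2\lambda_2 + \lambda_3)\right]
 + \alpha_n\frac{c}{2N^*}(\lambda_1 - \lambda_3) + \frac{c}{2N^*}\lambda_3 . \tag{48b}
$$

Insert Figure 7 about here

Fig. 7. Branching process for a triad probability on a
paired comparison learning trial.

Solutions for the pair of difference equations given by Eq. 48a and 48b
are well known and can be obtained by a number of different techniques
(see Goldberg, 1958, pp. 130-153; or Jordan, 1950). Any solution presented
can be verified by substituting into the appropriate difference equations.
However, for now we shall limit consideration to asymptotic results.
In terms of the Markov chain property of our process it can be shown
that the limits $\alpha = \lim_{n\to\infty}\alpha_n$ and
$\beta = \lim_{n\to\infty}\beta_n$ exist. Letting
$\alpha_{n+1} = \alpha_n = \alpha$ and $\beta_{n+1} = \beta_n = \beta$ in
Eq. 48a and 48b we obtain

$$
\alpha(2\lambda_1 + \lambda_3) = \beta(\lambda_2 - \lambda_3) + \lambda_3
$$

$$
\beta(2\lambda_2 + \lambda_3) = \alpha(\lambda_1 - \lambda_3) + \lambda_3 .
$$

Solving for $\alpha$ and $\beta$, and rewriting we have

$$
\lim_{n\to\infty}\Pr(A_{1,n}^{(123)}) =
 \frac{\lambda_2\lambda_3}{\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3} , \tag{49a}
$$

$$
\lim_{n\to\infty}\Pr(A_{2,n}^{(123)}) =
 \frac{\lambda_1\lambda_3}{\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3} , \tag{49b}
$$

and

$$
\lim_{n\to\infty}\Pr(A_{3,n}^{(123)}) =
 \frac{\lambda_1\lambda_2}{\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3} . \tag{49c}
$$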
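The asymptotes in Eq. 49a-c can be checked by iterating the pair of difference equations directly. The sketch below does so under assumed values of $\lambda_1, \lambda_2, \lambda_3$, $c$, and $N^*$; these numbers and the function names are hypothetical and serve only to illustrate the convergence.

```python
def step(alpha, beta, lam, c, n_star):
    """One application of Eq. 48a and 48b."""
    l1, l2, l3 = lam
    k = c / (2 * n_star)
    alpha_next = alpha * (1 - k * (2 * l1 + l3)) + beta * k * (l2 - l3) + k * l3
    beta_next = beta * (1 - k * (2 * l2 + l3)) + alpha * k * (l1 - l3) + k * l3
    return alpha_next, beta_next


def iterate(alpha, beta, lam, c, n_star, n_steps):
    """Iterate the recursion n_steps times from the given starting values."""
    for _ in range(n_steps):
        alpha, beta = step(alpha, beta, lam, c, n_star)
    return alpha, beta


def triple_asymptote(lam):
    """Closed-form limits of Eq. 49a-c for (A_1, A_2, A_3)."""
    l1, l2, l3 = lam
    d = l1 * l2 + l1 * l3 + l2 * l3
    return l2 * l3 / d, l1 * l3 / d, l1 * l2 / d


# Hypothetical loss probabilities and learning parameters.
LAM = (0.2, 0.5, 0.7)
alpha, beta = iterate(1 / 3, 1 / 3, LAM, c=0.4, n_star=5, n_steps=5000)
print(alpha, beta, 1 - alpha - beta)
print(triple_asymptote(LAM))
```

The limiting values are proportional to $1/\lambda_1$, $1/\lambda_2$, and $1/\lambda_3$, so the object with the smallest loss probability is chosen most often at asymptote.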


The other moments of the distribution of response probabilities can
be obtained following the methods employed in Sec. 3.1; and, at
asymptote we can generate the entire distribution. In particular, for
set $S_{ij}$ the asymptotic probability that $k$ patterns are conditioned
to $A_i$ and $N - k$ to $A_j$ is simply

$$
\binom{N}{k}\left(\frac{\lambda_j}{\lambda_i + \lambda_j}\right)^{k}
 \left(\frac{\lambda_i}{\lambda_i + \lambda_j}\right)^{N-k} .
$$

For the set $S_{123}$ the asymptotic probability of $k_1$ patterns
conditioned to $A_1$, $k_2$ to $A_2$, and $k_3$ to $A_3$ (where
$k_1 + k_2 + k_3 = N^*$) is

$$
\frac{N^*!}{k_1!\,k_2!\,k_3!}\;
\frac{(\lambda_2\lambda_3)^{k_1}(\lambda_1\lambda_3)^{k_2}(\lambda_1\lambda_2)^{k_3}}
     {(\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3)^{N^*}} .
$$
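The binomial form for a pair set can be verified numerically. The sketch below assumes the one-step transition rule described earlier for a pair $(A_iA_j)$: with probability $c$ conditioning is effective, and a sampled pattern changes its connection only after a loss. It then checks that the binomial distribution with parameter $\lambda_j/(\lambda_i + \lambda_j)$ is left unchanged by one presentation. Parameter values and names are illustrative assumptions.

```python
from math import comb


def stationary_binomial(n_patterns, lam_i, lam_j):
    """Asymptotic probability that k patterns are conditioned to A_i, k = 0..N."""
    p = lam_j / (lam_i + lam_j)
    return [comb(n_patterns, k) * p ** k * (1 - p) ** (n_patterns - k)
            for k in range(n_patterns + 1)]


def one_presentation(dist, n_patterns, lam_i, lam_j, c):
    """Propagate a distribution over k through one presentation of the pair."""
    new = [0.0] * (n_patterns + 1)
    for k, pr in enumerate(dist):
        # A pattern conditioned to A_i is sampled, a loss follows, conditioning is effective.
        down = c * (k / n_patterns) * lam_i
        # A pattern conditioned to A_j is sampled, a loss follows, conditioning is effective.
        up = c * ((n_patterns - k) / n_patterns) * lam_j
        new[k] += pr * (1 - down - up)
        if k > 0:
            new[k - 1] += pr * down
        if k < n_patterns:
            new[k + 1] += pr * up
    return new


# Hypothetical values for illustration.
dist = stationary_binomial(6, 0.6, 0.8)
print(max(abs(a - b) for a, b in zip(dist, one_presentation(dist, 6, 0.6, 0.8, 0.3))))
```

The printed maximum difference should be at the level of floating-point rounding, reflecting that the binomial distribution is stationary for this chain whatever the value of $c$.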
In analyzing data, it also is helpful to examine the marginal
limiting probability of an $A_i$ response, $\Pr(A_i)$, in addition to the
other quantities mentioned above. We define $\Pr(A_i)$ as the probability
of an $A_i$ response on any trial (regardless of the stimulus display) once
the process has reached asymptote. Theoretically

$$
\Pr(A_1) = \Pr(A_1^{(12)})\Pr(D^{(12)}) + \Pr(A_1^{(13)})\Pr(D^{(13)})
 + \Pr(A_1^{(123)})\Pr(D^{(123)}) ,
$$

$$
\Pr(A_2) = \Pr(A_2^{(12)})\Pr(D^{(12)}) + \Pr(A_2^{(23)})\Pr(D^{(23)})
 + \Pr(A_2^{(123)})\Pr(D^{(123)}) ,
$$

and

$$
\Pr(A_3) = 1 - \Pr(A_1) - \Pr(A_2) ,
$$

where $\Pr(D^{(ij)})$ is the probability of presenting the pair of objects
$(A_iA_j)$.
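The marginal probabilities can be computed mechanically from the pair and triple asymptotes. The sketch below assumes equal presentation probabilities of 1/4 for the four sets and uses the loss probabilities of the experiment analyzed below; $\lambda_2 = 6/10$ and $\lambda_3 = 8/10$ are legible in the source, whereas $\lambda_1 = 1/3$ is an assumption, chosen because it reproduces the predicted pair and triple values of Table 4.

```python
def pair_asymptote(lam_i, lam_j):
    """Asymptote of Eq. 47: probability of choosing A_i from the pair (A_i A_j)."""
    return lam_j / (lam_i + lam_j)


def triple_asymptote(lam):
    """Eq. 49a-c: asymptotic choice probabilities for the triple (A_1 A_2 A_3)."""
    l1, l2, l3 = lam
    d = l1 * l2 + l1 * l3 + l2 * l3
    return l2 * l3 / d, l1 * l3 / d, l1 * l2 / d


def marginals(lam, pr_display=0.25):
    """Marginal asymptotic probabilities Pr(A_1), Pr(A_2), Pr(A_3)."""
    l1, l2, l3 = lam
    a12 = pair_asymptote(l1, l2)   # Pr(A_1 | pair 12)
    a13 = pair_asymptote(l1, l3)   # Pr(A_1 | pair 13)
    a23 = pair_asymptote(l2, l3)   # Pr(A_2 | pair 23)
    t1, t2, _ = triple_asymptote(lam)
    pr_a1 = pr_display * (a12 + a13 + t1)
    pr_a2 = pr_display * ((1 - a12) + a23 + t2)
    return pr_a1, pr_a2, 1 - pr_a1 - pr_a2


# lambda_1 = 1/3 is an assumption (see the lead-in); the others are from the text.
LAMBDAS = (1 / 3, 6 / 10, 8 / 10)
print(marginals(LAMBDAS))
```

Under these assumptions the three values agree with the predicted marginals of Table 4 to within rounding in the third decimal place.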

The experimental results we consider were reported in preliminary
form in Suppes and Atkinson (1960). Two groups were run, each involving
48 subjects; subjects in one group won or lost one cent on each trial,
and those in the other group won or lost five cents on each trial. We
shall consider only the one-cent group, for an analysis of the
differential effects of the two reward values requires a more elaborate
interpretation of reinforcing events. Subjects were run for 400 trials with
the following reinforcement schedule:

$$
\lambda_1 = 1/3 , \qquad \lambda_2 = 6/10 , \qquad \lambda_3 = 8/10 .
$$

Figure 8 presents the observed proportions of $A_1$, $A_2$, and $A_3$
responses in successive 20-trial blocks. The three curves appear to
be very stable over the last 10 or so blocks; consequently we treat the
data over trials 301 to 400 as asymptotic.

Insert Figure 8 about here

By Eq. 47 and Eq. 49a-c we may generate predictions for
$\Pr(A_i^{(ij)})$ and $\Pr(A_i^{(123)})$. Given these values and the fact
that the four presentation sets occur with equal probabilities we may, as
shown above, generate predictions for $\Pr(A_i)$. The predicted values for
these quantities and the observed proportions over the last 100 trials are
presented in Table 4. The correspondence between predicted and observed
values is very good, particularly for $\Pr(A_i)$ and $\Pr(A_i^{(ij)})$. The
largest discrepancy is for the triple presentation set, where we note that
the observed value of $\Pr(A_1^{(123)})$ is .041 above the predicted value
of .507. The statistical problem of determining whether or not this
particular difference is significant is a complex matter and we do not
undertake it here. However, it should be noted that similar discrepancies
have been found in other studies dealing with three or more responses (see
Gardner, 1957; Detambel, 1955) and it may be necessary, in subsequent
developments of the theory, to consider some reinterpretation of
reinforcing events in the multiple response case.

Fig. 8. Observed proportion of $A_i$ responses in successive 20-trial
blocks for paired comparison experiment. (Ordinate: observed proportions;
abscissa: blocks of 20 trials.)

Insert Table 4 about here

In order to make predictions for more complex aspects of the data
it is necessary to obtain estimates of $c$, $N$ and $N^*$. Estimation
procedures of the sort referred to in Sec. 4.2 are applicable but the
analysis becomes tedious and such details are not appropriate here.
However, some comparisons can be made between sequential statistics
that do not depend on parameter values. For example, certain nonparametric
comparisons can be made between statistics where each individually
depends on $c$ and $N$, but where the difference is independent of these
parameters. Such comparisons are particularly helpful when they permit
us to discriminate among different models without introducing the
complicating factor of having to estimate parameters.

Table 4

Theoretical and observed asymptotic choice proportions
for paired-comparison learning experiment.

                        Predicted    Observed
$\Pr(A_1)$                .464         .473
$\Pr(A_2)$                .302         .292
$\Pr(A_3)$                .234         .235
$\Pr(A_1^{(12)})$         .643         .651
$\Pr(A_1^{(13)})$         .706         .700
$\Pr(A_2^{(23)})$         .571         .561
$\Pr(A_1^{(123)})$        .507         .548
$\Pr(A_2^{(123)})$        .282         .258
$\Pr(A_3^{(123)})$        .211         .194

To indicate the types of comparisons that are possible, we may
consider the subsequence of trials on which $(A_1A_2)$ is presented and,
in particular, the expression

$$
\Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right) .
$$

That is, the probability of an $A_1$ response on the $(n+1)$st
presentation of $(A_1A_2)$ given that on the $n$th presentation of
$(A_1A_2)$ an $A_1$ occurred and was followed by a win, and that on the
$(n-1)$st presentation of $(A_1A_2)$ an $A_2$ occurred followed by a win.
To compute this probability we note that

$$
\Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = \frac{\Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)}
        {\Pr\left(W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)} .
$$

Now our problem is to compute the two quantities on the right-hand side
of this equation. We first observe that

$$
\Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = \sum_{i,j}\Pr\left(A_{1,n+1}^{(12)} C_{j,n+1}^{(12)} W_n^{(12)} A_{1,n}^{(12)} C_{i,n}^{(12)}
   W_{n-1}^{(12)} A_{2,n-1}^{(12)} C_{i,n-1}^{(12)}\right) ,
$$

where $C_{i,n}^{(12)}$ denotes the conditioning state for set $S_{12}$ in
which $i$ elements are conditioned to $A_1$ and $N - i$ to $A_2$ on the
$n$th presentation of $(A_1A_2)$. Conditionalizing and applying the axioms,
we may expand the last expression into

$$
\begin{aligned}
\sum_{i,j}\;&\Pr\left(A_{1,n+1}^{(12)} \mid C_{j,n+1}^{(12)}\right)
 \Pr\left(C_{j,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} C_{i,n}^{(12)}\right) \\
 &\cdot (1 - \lambda_1)\,\Pr\left(A_{1,n}^{(12)} \mid C_{i,n}^{(12)}\right)
 \Pr\left(C_{i,n}^{(12)} \mid W_{n-1}^{(12)} A_{2,n-1}^{(12)} C_{i,n-1}^{(12)}\right)(1 - \lambda_2) \\
 &\cdot \Pr\left(A_{2,n-1}^{(12)} \mid C_{i,n-1}^{(12)}\right)\Pr\left(C_{i,n-1}^{(12)}\right) .
\end{aligned}
$$


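Further, the sampling and response axioms permit the simplifications

$$
\Pr\left(A_{1,n+1}^{(12)} \mid C_{j,n+1}^{(12)}\right) = \frac{j}{N} ,
\qquad
\Pr\left(A_{1,n}^{(12)} \mid C_{i,n}^{(12)}\right) = \frac{i}{N} ,
$$

and

$$
\Pr\left(A_{2,n-1}^{(12)} \mid C_{i,n-1}^{(12)}\right) = \frac{N - i}{N} .
$$

Finally, in order to carry out the summation, we make use of the relation

$$
\Pr\left(C_{j,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} C_{i,n}^{(12)}\right) =
\begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \neq j , \end{cases}
$$

which expresses the fact that no change in the conditioning state can
occur if the pattern sampled leads to a win (see Axiom C2). Combining
these results and simplifying we have

$$
\Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = (1 - \lambda_1)(1 - \lambda_2)\sum_i \left(\frac{i}{N}\right)^{2}\left(\frac{N - i}{N}\right)
   \Pr\left(C_{i,n-1}^{(12)}\right) . \tag{50a}
$$

Similarly we obtain

$$
\Pr\left(W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = (1 - \lambda_1)(1 - \lambda_2)\sum_i \left(\frac{i}{N}\right)\left(\frac{N - i}{N}\right)
   \Pr\left(C_{i,n-1}^{(12)}\right) , \tag{50b}
$$

and finally, taking the quotient of the last two expressions,

$$
\Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = \frac{\sum_i \left(\frac{i}{N}\right)^{2}\left(\frac{N - i}{N}\right)\Pr\left(C_{i,n-1}^{(12)}\right)}
        {\sum_i \left(\frac{i}{N}\right)\left(\frac{N - i}{N}\right)\Pr\left(C_{i,n-1}^{(12)}\right)} . \tag{50c}
$$

We next consider the same sequential statistic but with the responses
reversed on trials $n$ and $n - 1$; namely

$$
\Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right) .
$$

Interestingly enough, if we compute

$$
\Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right)
\quad\text{and}\quad
\Pr\left(W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right)
$$

they turn out to be expressed by the right sides of Eq. 50a and 50b,
respectively. Hence, for all $n$,

$$
\Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right)
 = \Pr\left(A_{1,n+1}^{(12)} \mid W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right) . \tag{51}
$$

Comparable predictions, of course, hold for the subsequences of trials
on which $(A_1A_3)$ or $(A_2A_3)$ are presented.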
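The equality of the two sequential probabilities can be seen concretely by enumerating conditioning states. The sketch below takes an arbitrary distribution over the states of set $S_{12}$ on presentation $n - 1$ and accumulates the joint probabilities for both orders of response, using only the facts stated above: the response probability in state $i$ is $i/N$, a win follows $A_1$ with probability $1 - \lambda_1$ and $A_2$ with probability $1 - \lambda_2$, and a win leaves the state unchanged. The function name and the numerical values are hypothetical.

```python
def sequential_probs(state_dist, n_patterns, lam1, lam2):
    """Return the two conditional probabilities of an A_1 response on
    presentation n+1: after (A_2 win, then A_1 win) and after
    (A_1 win, then A_2 win). state_dist[i] is the probability that
    i patterns are conditioned to A_1 on presentation n-1."""
    win1, win2 = 1 - lam1, 1 - lam2
    num_a = den_a = num_b = den_b = 0.0
    for i, pr in enumerate(state_dist):
        p1 = i / n_patterns          # probability of A_1 in state i
        p2 = 1 - p1                  # probability of A_2 in state i
        # A_2 and a win on n-1, then A_1 and a win on n; state stays at i.
        joint_a = pr * p2 * win2 * p1 * win1
        # A_1 and a win on n-1, then A_2 and a win on n; state stays at i.
        joint_b = pr * p1 * win1 * p2 * win2
        den_a += joint_a
        num_a += joint_a * p1
        den_b += joint_b
        num_b += joint_b * p1
    return num_a / den_a, num_b / den_b


# Any state distribution will do; this one is purely illustrative.
print(sequential_probs([0.1, 0.2, 0.3, 0.25, 0.15], 4, 0.3, 0.6))
```

Because wins never alter the conditioning state, the order of the two winning responses does not matter, and the factors $(1 - \lambda_1)(1 - \lambda_2)$ cancel in the quotient. This is why the prediction is free of $c$, $N$, and the $\lambda$ values.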


Equation 51 provides a test of the theory which does not depend on
parameter estimates. Further, it is a prediction that differentiates
between this model and many other models. For example, in the next
section we consider a certain class of linear models, and it can be
shown that they generate the same predictions for the quantities in
Table 4 as the pattern model. However, the sequential equality displayed
in Eq. 51 does not hold for the linear model.

To check these predictions, we shall utilize the data over all trials
of the $(A_1A_2)$ subsequence and not restrict the analysis to asymptotic
performance. Specifically we define

$$
\xi_{112} = \sum_n \Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right) ,
$$

$$
\xi_{121} = \sum_n \Pr\left(A_{1,n+1}^{(12)} W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right) ,
$$

$$
\xi_{12} = \sum_n \Pr\left(W_n^{(12)} A_{1,n}^{(12)} W_{n-1}^{(12)} A_{2,n-1}^{(12)}\right) ,
$$

$$
\xi_{21} = \sum_n \Pr\left(W_n^{(12)} A_{2,n}^{(12)} W_{n-1}^{(12)} A_{1,n-1}^{(12)}\right) .
$$

But by the results just obtained we have $\xi_{112} = \xi_{121}$ and
$\xi_{12} = \xi_{21}$ for any given subject. Further, if we define
$\bar{\xi}_{ijk}$ as the sum of the $\xi_{ijk}$'s over all subjects then it
follows that $\bar{\xi}_{121} = \bar{\xi}_{112}$ independent of
intersubject differences in $c$ and $N$. Similarly
$\bar{\xi}_{12} = \bar{\xi}_{21}$. Thus we have a set of predictions which
are not only nonparametric but which require no restrictive assumptions on
variability between subjects. Observed frequencies corresponding to these
theoretical quantities are as follows:

$\bar{\xi}_{121} = 140$,    $\bar{\xi}_{112} = 138$
$\bar{\xi}_{21} = 245$,     $\bar{\xi}_{12} = 244$
$\bar{\xi}_{121}/\bar{\xi}_{21} = .571$,    $\bar{\xi}_{112}/\bar{\xi}_{12} = .566$

Similarly, for the $(A_1A_3)$ subsequence

$\bar{\xi}_{131} = 67$,     $\bar{\xi}_{113} = 64$
$\bar{\xi}_{31} = 120$,     $\bar{\xi}_{13} = 122$
$\bar{\xi}_{131}/\bar{\xi}_{31} = .558$,    $\bar{\xi}_{113}/\bar{\xi}_{13} = .525$

Finally, for the $(A_2A_3)$ subsequence

$\bar{\xi}_{232} = 45$,     $\bar{\xi}_{223}$ = [illegible]
$\bar{\xi}_{32} = 82$,      $\bar{\xi}_{23} = 87$
$\bar{\xi}_{232}/\bar{\xi}_{32} = .549$,    $\bar{\xi}_{223}/\bar{\xi}_{23}$ = [illegible]
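Counts of this kind can be tallied directly from a subject's record on one pair subsequence. The sketch below is one hypothetical way of doing so: the input format (a list of `(response, won)` tuples), the function name, and the dictionary keys are assumptions, and the subscripts are read in the order (response on presentation $n+1$, response on $n$, response on $n-1$), which is an interpretation of the notation rather than something the source states outright.

```python
def count_sequences(trials, first=1, second=2):
    """Tally the four sequential counts for one pair subsequence.

    trials: list of (response, won) tuples for successive presentations
    of the pair, where response is `first` or `second` and won is a bool.
    """
    counts = {"x112": 0, "x121": 0, "x12": 0, "x21": 0}
    for prev, cur, nxt in zip(trials, trials[1:], trials[2:]):
        (r_prev, w_prev), (r_cur, w_cur), (r_next, _) = prev, cur, nxt
        if not (w_prev and w_cur):
            continue  # both conditioning presentations must end in a win
        if r_cur == first and r_prev == second:
            counts["x12"] += 1
            counts["x112"] += int(r_next == first)
        elif r_cur == second and r_prev == first:
            counts["x21"] += 1
            counts["x121"] += int(r_next == first)
    return counts


# A short made-up record, for illustration only.
record = [(2, True), (1, True), (1, False), (1, True), (2, True), (1, True)]
print(count_sequences(record))
```

Summing such counts over subjects and forming the two ratios gives the observed counterparts of the two sides of Eq. 51. Note that this sketch counts a conditioning pair only when a following presentation exists.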


Further analyses will be required to determine whether the pattern
model gives an entirely satisfactory interpretation of paired-comparison
learning. It is already apparent, however, that it may be very difficult
indeed to find another theory that takes us further in this direction
than the pattern model with equally simple machinery.

4. A COMPONENT MODEL FOR STIMULUS COMPOUNDING AND GENERALIZATION

4.1 Basic Concepts; Conditioning and Response Axioms

In the preceding section we simplified our analysis of learning in
terms of the N-element pattern model by assuming that all of the patterns
involved in a given experiment are disjoint, or at any rate that
generalization effects from one stimulus pattern to another are
negligible. Now we shall go to the other extreme and treat problems of
simple transfer of training between different stimulus situations that
have elements in common in a purely cross-sectional manner, with no
reference to a learning process occurring over trials. Again the basic
mathematical apparatus will be that of sets and elements, but with a
reinterpretation which needs to be clearly distinguished from that of the
pattern model. In Sections 2 and 3 we regarded the pattern of stimulation
effective on any trial as a single element sampled from a larger set of
such patterns; now we shall consider the trial pattern as itself
constituting a set of elements, the elements representing the various
components or aspects of the stimulus situation which may be sampled by
the subject in differing combinations on different trials. We shall
proceed first to give the two basic axioms that establish the dependence
of response probability on the conditioning state of the stimulus sample.
Then some theorems will be derived that specify relationships between
response probabilities in overlapping stimulus samples, and these will be
illustrated in terms of applications to experiments on simple stimulus
compounding. Consideration of the process whereby trial samples are drawn
from a larger stimulus population will be deferred to Section 4.2.


The basic axioms of the component model are as follows:

C1. The sample $s$ of stimulation effective on any trial is partitioned
into subsets $s_i$ ($i = 1, 2, \ldots, r$, where $r$ is the number of
response alternatives), the $i$-th subset containing the elements
conditioned to (or "connected to") response $A_i$.

C2. The probability of response $A_i$ in the presence of the stimulus
sample $s$ is given by

$$
\Pr(A_i \mid s) = \frac{N(s_i)}{N(s)} ,
$$

where $N(x)$ denotes the number of elements in the set $x$.

In C1 we modify the usual definition of a partition to the extent of
permitting some of the subsets to be empty; that is, there may be some
response alternatives which are conditioned to none of the elements of
$s$. We do mean to assume, however, that each element of $s$ is
conditioned to exactly one response. The substance of C2 is, then, to make
the probability that a given response will be evoked by $s$ equal to the
proportion of elements of $s$ that are conditioned to that response.


4.2 Stimulus Compounding 

An elementary transfer situation arises if one reinforces two
responses, each in the presence of a different stimulus sample, then
combines all or part of one sample with all or part of the other to
form a new test situation. To begin with a special case, let us consider
an experiment conducted in the laboratory of one of the writers
(W.K.E.).

[Footnote: This experiment was conducted at Indiana University with the
assistance of Miss Joan SeBreny.]

In one stage of the experiment, a number of disjoint samples of three
distinct cues drawn from a large population were used as the stimulus
members of paired-associate items, and by the usual method of paired
presentation one response was reinforced in the presence of some of
these samples and a different response in the presence of others. The
constituent cues, intended to serve as the empirical counterparts of
stimulus elements, were various typewriter symbols, which for present
purposes we shall designate by small letters a, b, c, etc., and the
responses were the numbers "one" and "two," spoken aloud. Instructions
to the subjects indicated that the cues represented symptoms and the
numbers diseases with which the symptoms were associated. Following
the training trials, new combinations of "symptoms" were formed, and the
subjects were instructed to make their best guesses at the correct
diagnoses.

Suppose now that response $A_1$ had been reinforced in the presence
of the sample (abc) and response $A_2$ in the presence of the sample
(def). If a test trial were given subsequently with the sample (abd),
direct application of Axiom C2 yields the prediction that response $A_1$
should occur with probability 2/3. Similarly, if a test were given
with the sample (ade), response $A_1$ would be predicted to occur with


probability 1/3. Results obtained with 40 subjects, each given 24 tests of
each type, were as follows:

Percentage overlap of training and test sets     .667    .333
Percentage response 1 to test set                .669    .332
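The two predictions follow mechanically from Axiom C2 once each cue is treated as a single element. The sketch below illustrates this under that assumption; the function name and the dictionary representation of the conditioning state are hypothetical conveniences.

```python
def response_probability(test_sample, conditioning, response):
    """Axiom C2: proportion of elements in the test sample that are
    conditioned to the given response."""
    hits = sum(1 for cue in test_sample if conditioning[cue] == response)
    return hits / len(test_sample)


# Training: response 1 reinforced to (abc), response 2 reinforced to (def).
conditioning = {cue: 1 for cue in "abc"}
conditioning.update({cue: 2 for cue in "def"})

print(response_probability("abd", conditioning, 1))  # two of three cues tied to response 1
print(response_probability("ade", conditioning, 1))  # one of three cues tied to response 1
```

The predicted values 2/3 and 1/3 are simply the proportions of overlap between the test set and the training set for response 1.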

Success in bringing off a priori predictions of this sort depends
not only on the basic soundness of the theory but also on one's success
in realizing various simplifying assumptions in the experimental
situation. As mentioned above, it was our intention in designing the
experiment just cited to choose cues, a, b, c, etc., which would take on
the role of stimulus elements. Actually, in order to justify our
theoretical predictions, it was necessary only that the cues behave as
equal-sized sets of elements. To bring out the importance of the equal-N
assumption, let us suppose that the individual cues actually correspond to
sets $s_a$, $s_b$, etc., of elements. Then, given the same training
(response $A_1$ reinforced to the combination abc and response $A_2$ to
def), and assuming the training effective in conditioning all elements of
each subset to the reinforced response, application of Axiom C2 yields for
the probability of response $A_1$ to abd

$$
\Pr(A_1 \mid s_a, s_b, s_d) = \frac{N_a + N_b}{N_a + N_b + N_d} ,
$$

where we have used the obvious abbreviation $N(s_i) = N_i$. This equation
reduces to $\Pr(A_1 \mid s_a, s_b, s_d) = 2/3$ only if $N_a = N_b = N_d$.

In this experiment we depended on commonsense considerations to
choose cues which could be expected to satisfy the equal-N requirement,
and also counterbalanced the design of the experiment so that minor
deviations might be expected to average out. Sometimes it may not be
possible to depend on commonsense considerations. In that case, one
can utilize a preliminary experiment to check on the simplifying
assumptions. Suppose, for example, we had been in doubt as to whether cues
a and b would behave as equal-sized sets. To check on this, we could
have run a preliminary experiment in which we reinforced, say, response
$A_1$ to a and response $A_2$ to b, then tested with the compound ab.
Probability of response $A_1$ to ab is, according to the model, given
by

$$
\Pr(A_1 \mid s_a, s_b) = \frac{N_a}{N_a + N_b} ,
$$

which should deviate in the appropriate direction from 1/2 if $N_a$ and
$N_b$ are not equal. By means of calibration experiments of this sort,
sets of cues satisfying the equal-N assumption can be assembled for use
in further research involving applications of the model.
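The calibration idea can be made concrete with two small functions. The first evaluates the prediction for the compound abd when the cues correspond to sets of sizes $N_a$, $N_b$, $N_d$; the second inverts the calibration formula $\Pr(A_1 \mid s_a, s_b) = N_a/(N_a + N_b)$ to recover the ratio $N_a/N_b$ from an observed proportion. The inversion is a direct algebraic consequence of the formula rather than a procedure described in the source, and the function names and set sizes are hypothetical.

```python
def prob_a1_to_abd(n_a, n_b, n_d):
    """Pr(A_1 | s_a, s_b, s_d) when s_a, s_b are fully conditioned to A_1
    and s_d is fully conditioned to A_2."""
    return (n_a + n_b) / (n_a + n_b + n_d)


def size_ratio_from_calibration(p_a1):
    """Invert Pr(A_1 | s_a, s_b) = N_a / (N_a + N_b) to obtain N_a / N_b."""
    if not 0 < p_a1 < 1:
        raise ValueError("observed proportion must lie strictly between 0 and 1")
    return p_a1 / (1 - p_a1)


print(prob_a1_to_abd(5, 5, 5))    # equal set sizes
print(prob_a1_to_abd(5, 5, 10))   # cue d twice as large as a and b
print(size_ratio_from_calibration(0.6))
```

With equal sizes the prediction is 2/3; if cue d corresponded to a set twice as large, the same training would predict 1/2, which is the kind of deviation a calibration experiment is meant to detect.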

The expressions obtained above for probabilities of response to
stimulus compounds can readily be generalized with respect both to set
sizes and level of training. Suppose that a collection of cues a, b, c, ...
corresponds to a collection of stimulus sets $s_a, s_b, s_c, \ldots$ of
sizes $N_a, N_b, N_c, \ldots$ and that some response $A_i$ is conditioned
to a proportion $p_{ai}$ of the elements in $s_a$, a proportion $p_{bi}$ of
the elements in $s_b$, and so on. Then probability of response $A_i$ to a
compound of these cues is, by Axiom C2, expressed by the relation

$$
\Pr(A_i \mid s_a, s_b, s_c, \ldots) =
 \frac{N_a p_{ai} + N_b p_{bi} + N_c p_{ci} + \cdots}{N_a + N_b + N_c + \cdots} . \tag{52}
$$

Application of Eq. 52 can be illustrated in terms of a study of
probabilistic discrimination learning reported by Estes, Burke, Atkinson,
and Frankmann (1957). In this study the individual cues were lights
which differed from each other only in their positions on a panel. The
first stage of the experiment consisted in discrimination training
according to a routine which we shall not describe here except to say
that on theoretical grounds it was predicted that at the end of training
the proportion of elements in a sample associated with the $i$-th light
conditioned to the first of two alternative responses would be given by
$p_{i1} = i/13$. Following this training, the subjects were given
compounding tests with various triads of lights. Considering, say, the
triad of lights 1, 2, and 3, the values of $p_{i1}$ should be
$p_{11} = 1/13$, $p_{21} = 2/13$, and $p_{31} = 3/13$; assuming
$N_1 = N_2 = N_3 = N$, and substituting these values into Eq. 52, we obtain

$$
\Pr(A_1 \mid 1, 2, 3) = \frac{\frac{N}{13} + \frac{2N}{13} + \frac{3N}{13}}{3N}
 = \frac{2}{13} = .15
$$

as the predicted probability of response 1 to the compound 1, 2, 3.
Theoretical values similarly computed for a number of triads are compared
with the empirical test proportions reported by Estes et al. in Table 5.
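The triad predictions can be generated with a direct implementation of Eq. 52. The sketch below assumes equal set sizes for all lights and takes the trained proportion for light $i$ to be $i/13$; that value is an assumption here, adopted because it yields .15 for the triad 1, 2, 3 as in the worked example. The function names are hypothetical.

```python
def compound_probability(sizes, proportions):
    """Eq. 52: size-weighted mean of the conditioned proportions."""
    numerator = sum(n * p for n, p in zip(sizes, proportions))
    return numerator / sum(sizes)


def triad_prediction(lights, denominator=13, set_size=1.0):
    """Predicted probability of response 1 to a triad of lights, assuming
    p_i1 = i / denominator and equal set sizes."""
    proportions = [i / denominator for i in lights]
    return compound_probability([set_size] * len(lights), proportions)


for triad in [(1, 2, 3), (4, 5, 6), (7, 8, 9), (2, 10, 12), (10, 11, 12)]:
    print(triad, round(triad_prediction(triad), 2))
```

With equal set sizes the common value $N$ cancels, so the prediction is just the mean of the three trained proportions.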
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Insert Table 5 about here 

An important consideration in applications of models for stimulus 
compounding is the question of whether the experimental situation contains 
an appreciable amount of background stimulation in addition to the controlled 
stimuli manipulated by the experimenter. Suppose, for example, we are 
interested in the problem of whether a compound of two conditioned stimuli, 
say a light and'a tone, each of which has been paired with the same uncon- 
ditioned stimulus, may have a higher probability of evoking a conditioned 
response (CR) than either of the stimuli presented separately. To ana- 
-lyze this problem in terms of the present model, we may represent the 
Light and the tone by stimulus sets 8), and Sn Assuming that assa 
result of the previous reinforcement the proportions of conditioned 
elements in er and Sn (ana therefore the probabilities of CRs to the 
stimuli taken separately) are Py, and Pp » respectively, application of 
Axiom C2 yields for the probability of a CR to the compound of light and 


tone presented together, neglecting any possible background stimulation, 


NPy, + NpPn 


Pr(CR|L,T) = 
N+ 0, 


Clearly, the probability of a CR to the compound is simply a weighted 
mean of Py and Pp > and therefore its value must fall between the 
probabilities of a CR to the two conditioned stimuli taken separately. 


No "summation" effect is predicted. 
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Table 5 


Theoretical and observed proportions of response AL 


to triads of lights in stimulus compounding test. 


Triad Theoretical Observed | 
Dy 2505, . 15 22. - 
4, 5,60 38 31 
Fai Dy : a 38 AL 
7, 8, 9 620 59 
2, 10, 12 — 62 58 


lo, 11, 12 a (85 ake et 
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Often, however, it may be unrealistic to assume background stimula- 
tion from ths. apperetod and surroundings to be negligible. In fact, the 
experimenter may have to count onan appreciable amount of background 
stimulation, predominantly conditioned to behaviors incompatible with 


" 


the CR, to prevent "spontaneous" occurrences of the to~be-conditioned 
response during intervals between: presentations of the experimentally 
‘controlled stimuli. Let us now expand our representation of the condi- 
tioning situation by defining a set 5 of background elements, a propor- 
tion Py of which are conditioned to the OCR. For simplicity, we shall 
consider only the special case of Py = 0. Then the theoretical proba- 
bilities of evocation of the CR by the light, the tone, and the compound 
of light and sound (together with background stimulation in each case) 


are given by 





N, Pp. 
L°L 
al ae ac 
L b 
NgP 
T 
Pr(cR|T) = i, 7%? 
and 
NePm + Np. 
T iL 
Pr(cR|L,@) = a a 
respectively. Under these conditions it is possible to obtain a summation effect. Assume, for example, that $N_T = N_L = N_b$ and $p_T > p_L$, so $\Pr(\mathrm{CR} \mid T) > \Pr(\mathrm{CR} \mid L)$. Taking the difference between the probability of a CR to the compound and probability of a CR to the tone alone, we have

\[ \Pr(\mathrm{CR} \mid L, T) - \Pr(\mathrm{CR} \mid T) = \frac{p_T + p_L}{3} - \frac{p_T}{2} = \frac{2p_L - p_T}{6}, \]

which is positive if the inequality $2p_L > p_T$ holds. Thus, in this case, probability of a CR to the compound will exceed probability of a CR to either conditioned stimulus alone, provided that $p_T$ is not more than twice $p_L$.
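
The summation argument can be checked numerically. The following is a minimal sketch, assuming equal set sizes for light, tone, and background and background elements with $p_b = 0$; the function name `pr_cr` and the particular values chosen for $N$, $p_L$, and $p_T$ are illustrative assumptions, not values from the text.

```python
from fractions import Fraction


def pr_cr(components, n_background):
    """Probability of a CR to a compound of conditioned stimuli plus background.

    components: list of (N, p) pairs, one per stimulus set, where a
    proportion p of the N elements is conditioned to the CR.
    Background elements are assumed to have p_b = 0.
    """
    conditioned = sum(n * p for n, p in components)
    total = sum(n for n, _ in components) + n_background
    return conditioned / total


# Hypothetical values: N_T = N_L = N_b = N, with p_T > p_L and 2 * p_L > p_T.
N = 10
p_L, p_T = Fraction(3, 5), Fraction(4, 5)

light = pr_cr([(N, p_L)], N)
tone = pr_cr([(N, p_T)], N)
compound = pr_cr([(N, p_L), (N, p_T)], N)

# The difference should agree with (2 p_L - p_T) / 6.
print(light, tone, compound, compound - tone)
```

With these values the compound exceeds the tone alone; choosing $p_T$ greater than $2p_L$ instead makes the difference negative, as the inequality requires.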

The role of background stimuli has been particularly important in the interpretation of drive stimuli. It has been assumed (Estes, 1958, 1961a) that in simple animal learning experiments (e.g., those involving the learning of running or bar-pressing responses with food or water reward) the stimulus sample to which the animal responds at any time is compounded from several sources: the experimentally controlled conditioned stimulus (CS) or equivalent; stimuli, perhaps largely intra-organismic in origin, controlled by the level of food or water deprivation; and extraneous stimuli which are not systematically correlated with reward of the response undergoing training and therefore remain for the most part connected to competing responses. It is assumed further that the sizes of samples of elements associated with the CS and with extraneous sources, $N_C$ and $N_E$, are independent of drive, but that the size of the sample of drive-stimulus elements, $N_D$, increases as a function of deprivation. In most simple reward-learning experiments, conditioning to the CS and drive cues would proceed concurrently, and one might expect that at a given stage of learning the proportions of elements in samples from these sources conditioned to the rewarded response, R, would be equal, i.e., $p_C = p_D$. If this were the case, then probability of the rewarded response would be independent of deprivation; for, letting D and D' correspond to levels of deprivation such that $N_D < N_{D'}$, we have as the theoretical probabilities of response R at the two deprivations,

\[ \Pr(R \mid CS, D) = \frac{N_C p_C + N_D p_D}{N_C + N_D} \]

and

\[ \Pr(R \mid CS, D') = \frac{N_C p_C + N_{D'} p_{D'}}{N_C + N_{D'}} . \]


If the same training were given at the two drive levels, then we would have $p_D = p_{D'}$ as well as $p_C = p_D$; in this case the difference between the two expressions is zero. Considering the same assumptions but with extraneous cues taken explicitly into account, we arrive at a quite different picture. In this case the two expressions for response probability are

\[ \Pr(R \mid CS, D, E) = \frac{N_C p_C + N_D p_D + N_E p_E}{N_C + N_D + N_E} \]

and

\[ \Pr(R \mid CS, D', E) = \frac{N_C p_C + N_{D'} p_{D'} + N_E p_E}{N_C + N_{D'} + N_E} . \]


Now, letting $p_C = p_D = p_{D'} = p$, and for simplicity taking $p_E = 0$, we obtain for the difference

\[
\begin{aligned}
\Pr(R \mid CS, D', E) - \Pr(R \mid CS, D, E)
&= p\left[\frac{N_C + N_{D'}}{N_C + N_{D'} + N_E} - \frac{N_C + N_D}{N_C + N_D + N_E}\right] \\
&= \frac{p\,N_E\,(N_{D'} - N_D)}{(N_C + N_{D'} + N_E)(N_C + N_D + N_E)} ,
\end{aligned}
\]

which is obviously greater than zero given the assumption $N_{D'} > N_D$. Thus, in this theory, the principal reason why probability of the rewarded response tends, other things equal, to be higher at higher deprivations is that the larger the sample of drive stimuli, the more effective it is in outweighing the effects of extraneous stimuli.
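
The contrast between the two cases can be illustrated with a short computation. This is a sketch under stated assumptions: the helper `pr_response` and all numerical values (set sizes and the common proportion $p$) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def pr_response(sources):
    """Pooled proportion of conditioned elements over several stimulus sources.

    sources: list of (N, p) pairs, N being the number of elements sampled
    from a source and p the proportion of them conditioned to response R.
    """
    return sum(n * p for n, p in sources) / sum(n for n, _ in sources)


p = Fraction(3, 4)              # common proportion for CS and drive cues (assumed)
N_C, N_E = 10, 10               # CS and extraneous elements (assumed)
N_D_low, N_D_high = 5, 20       # drive elements at deprivations D and D' (assumed)

# Without extraneous cues, response probability does not depend on drive.
without_E = [pr_response([(N_C, p), (n_d, p)]) for n_d in (N_D_low, N_D_high)]

# With extraneous cues (p_E = 0), the higher drive level gives the higher probability.
with_E = [pr_response([(N_C, p), (n_d, p), (N_E, 0)]) for n_d in (N_D_low, N_D_high)]

closed_form = (p * N_E * (N_D_high - N_D_low)
               / ((N_C + N_D_high + N_E) * (N_C + N_D_low + N_E)))

print(without_E, with_E, with_E[1] - with_E[0], closed_form)
```

The last two printed values coincide, which is the difference formula derived above evaluated at the assumed set sizes.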


4.3 Sampling Axioms and Major Response Theorem of Fixed Sample Size Model

In Sec. 4.2 we considered some transfer effects which can be derived within a component model by considering only relationships among stimulus samples that have had different reinforcement histories. Generally, however, it is desirable to take account of the fact that there may not always be a one-to-one correspondence between the experimental stimulus display and the stimulation actually influencing the subject's behavior. Owing to a number of factors, e.g., variations in receptor-orienting responses, fluctuations in the environmental situation, variations in excitatory states or sensitivities of receptors, the subject often may sample only a portion of the stimulation made available by the experimenter. One of the chief problems of statistical learning theories has been to formulate conceptual representations of the stimulus sampling process and to develop their implications for learning phenomena. With respect to specific mathematical properties of the sampling process, component models that have appeared in the literature may be classified into two main types: (1) models assuming fixed sampling probabilities for the individual elements of a stimulus population, in which case sample size varies randomly from trial to trial; and (2) models assuming a fixed ratio between sample size and population size. The former type was first discussed by Estes and Burke (1953), the latter by Estes (1950); and some detailed comparisons of the two types have been presented by Estes (1959b). In this section we shall limit consideration to models of the second type, since these are in most respects easier to work with.

In the remainder of this section we shall distinguish stimulus populations and samples by using $S$, with subscripts as needed, for a population, and $s$ for a sample. The sampling axioms to be utilized are as follows:

S1. For any fixed, experimenter-defined stimulating situation, sample size and population size are constant over trials.

S2. All samples of the same size have equal probabilities.

A prerequisite to nearly all applications of the model is a theorem relating response probability to the state of conditioning of a stimulus population. We shall derive the theorem in terms of a stimulus situation $S$ containing $N$ elements from which a sample of size $N(s) = \sigma$ is drawn on each trial. Assuming that some number $N_i$ of the elements of $S$ are conditioned to response $A_i$, we wish to obtain an expression for the expected proportion of elements conditioned to $A_i$ in samples drawn from $S$, since this proportion will, by Axiom C2, be equal to the probability of evocation of response $A_i$ by samples from $S$. We begin, as usual, with the probability in which we are interested, then, using the axioms of the model as appropriate, proceed to expand in terms of the state of conditioning and possible stimulus samples:

\[ \Pr(A_i \mid S) = \sum_{s} \Pr(A_i \mid s)\,\Pr(s \mid S), \]

the summation being over all samples of size $\sigma$ that can be drawn from $S$. Next, substituting expressions for the conditional probabilities, we obtain

\[ \Pr(A_i \mid S) = \sum_{N(s_i)=0}^{\sigma} \frac{N(s_i)}{\sigma} \cdot \frac{\dbinom{N_i}{N(s_i)} \dbinom{N - N_i}{\sigma - N(s_i)}}{\dbinom{N}{\sigma}} . \]

In the last expression on the right, $N(s_i)/\sigma$ represents the probability of $A_i$ in the presence of a sample of size $\sigma$ containing a subset $s_i$ of elements conditioned to $A_i$; the product of binomial coefficients denotes the number of ways of obtaining exactly $N(s_i)$ elements conditioned to $A_i$ in a sample of size $\sigma$, so the ratio of this product to the number of ways of drawing a sample of size $\sigma$ is the probability of obtaining the given value of $N(s_i)/\sigma$. The resulting formula will be recognized as the familiar expression for the mean of a hypergeometric distribution (Feller, 1957, p. 218), so we have the pleasingly simple outcome that the probability of a response to the stimulating situation represented by a set $S$ is equal to the proportion of elements of $S$ that are conditioned to the given response:

\[ \Pr(A_i \mid S) = \frac{N_i}{N} . \tag{53} \]

This result may seem too intuitively obvious to have needed a proof, but one should note that the same theorem does not hold in general for component models with fixed sampling probabilities for the elements (cf. Estes and Suppes, 1959b).
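
The theorem can be checked by evaluating the hypergeometric sum directly. The sketch below is an illustration only; the function name `response_probability` and the test values are assumptions, and exact rational arithmetic is used so that the comparison with $N_i/N$ is not blurred by rounding.

```python
from fractions import Fraction
from math import comb


def response_probability(N, N_i, sigma):
    """Expected proportion of A_i-conditioned elements in a sample of size sigma.

    N     : number of elements in the stimulus population S
    N_i   : number of those elements conditioned to response A_i
    sigma : fixed sample size
    All samples of size sigma are taken to be equally probable (Axiom S2).
    """
    total = Fraction(0)
    for k in range(sigma + 1):          # k plays the role of N(s_i)
        ways = comb(N_i, k) * comb(N - N_i, sigma - k)
        total += Fraction(k, sigma) * Fraction(ways, comb(N, sigma))
    return total


# The sum equals N_i / N whatever the sample size.
for sigma in range(1, 11):
    assert response_probability(10, 4, sigma) == Fraction(4, 10)

print(response_probability(10, 4, 3))
```

Note that the result is independent of the sample size `sigma`, which is why the fixed sample size model is easier to work with than the model with fixed sampling probabilities for the individual elements.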


4.4 Interpretation of Stimulus Generalization 

Our approach to the problem of stimulus generalization is to represent the similarity between two stimuli by the amount of overlap between two sets of elements.^9

^9 A model similar in most essentials has been presented by Bush and Mosteller (1951b).




















In the simplest experimental paradigm for exhibiting generalization, we begin with two stimulus situations, represented by sets $S_a$ and $S_b$, neither of which has any of its elements conditioned to a reference response $A_1$. Training is given by reinforcement of $A_1$ in the presence of $S_a$ only until the probability of $A_1$ in that situation reaches some value $p_{a1} > 0$. Then test trials are given in the presence of $S_b$, and if $p_{b1}$ now proves to be greater than zero, we say that stimulus generalization has occurred. If the axioms of the component model are satisfied, the value of $p_{b1}$ provides, in fact, a measure of the overlap of $S_a$ and $S_b$; for, by Eq. 53, we have immediately

\[ p_{b1} = \frac{N(S_a \cap S_b)\,p_{a1}}{N(S_b)} , \]

where $S_a \cap S_b$ denotes the set of elements common to $S_a$ and $S_b$, since the numerator of this fraction is simply the number of elements in $S_b$ that are now conditioned to response $A_1$. More generally, if the proportion of elements of $S_b$ conditioned to $A_1$ prior to the experiment were equal to $g_{b1}$, not necessarily zero, the probability of response $A_1$ to stimulus $S_b$ after training in $S_a$ would be given by

\[ p_{b1} = \frac{N(S_a \cap S_b)\,p_{a1} + [N(S_b) - N(S_a \cap S_b)]\,g_{b1}}{N(S_b)} , \]

or with the more compact notation $N_{ab} = N(S_a \cap S_b)$, etc.,

\[ p_{b1} = \frac{N_{ab}\,p_{a1} + (N_b - N_{ab})\,g_{b1}}{N_b} . \tag{54a} \]

This relation can be put in still more convenient form by letting $N_{ab}/N_b = w_{ab}$, viz.,

\[ p_{b1} = w_{ab}\,p_{a1} + (1 - w_{ab})\,g_{b1} . \]

This equation may be rearranged to read

\[ p_{b1} = w_{ab}\,(p_{a1} - g_{b1}) + g_{b1} , \tag{54b} \]


and we see that the difference $(p_{a1} - g_{b1})$ between the post-training probability of $A_1$ in $S_a$ and the pre-training probability in $S_b$ can be regarded as the slope parameter of a linear "gradient" of generalization in which $p_{b1}$ is the dependent variable and the proportion of overlap between $S_a$ and $S_b$ is the independent variable. If we hold $g_{b1}$ constant and let $p_{a1}$ vary as the parameter, we generate a family of generalization gradients which have their greatest disparities at $w_{ab} = 1$ (i.e., when the test stimulus is identical with $S_a$) and converge as the overlap between $S_a$ and $S_b$ decreases, until the gradients meet at $p_{b1} = g_{b1}$ when $w_{ab} = 0$. Thus the family of gradients shown in Fig. 9 illustrates the picture to be expected if a series of generalization tests is given at each of several different stages of training in $S_a$, or, alternatively, at several different stages of extinction following training in $S_a$, as was done, for example, by Guttman and Kalish (1956). The problem of "calibrating" a physical stimulus dimension so as to obtain a series of values which represent equal differences in the value of $w_{ab}$ has been discussed by Carterette (1961).

Fig. 9. Generalization from a training stimulus, $S_a$, to a test stimulus, $S_b$, at several stages of training. The parameters are $w_{ab} = .5$, the proportion of overlap between $S_a$ and $S_b$, and $g_{b1} = .1$, the probability of response $A_1$ to $S_b$ prior to training. Ordinate: probability of response $A_1$; abscissa: stimuli $S_a$ and $S_b$.
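
The linear gradient of Eq. 54b is easy to tabulate. The following sketch uses the parameter values quoted in the caption of Fig. 9 ($w_{ab} = .5$, $g_{b1} = .1$); the function names and the particular set sizes and training levels $p_{a1}$ are assumptions made for illustration.

```python
from fractions import Fraction


def generalization(w_ab, p_a1, g_b1):
    """Eq. 54b: probability of A_1 to the test stimulus S_b after training on S_a."""
    return w_ab * (p_a1 - g_b1) + g_b1


def generalization_from_counts(N_ab, N_b, p_a1, g_b1):
    """Eq. 54a: the same quantity written in terms of element counts."""
    return (N_ab * p_a1 + (N_b - N_ab) * g_b1) / N_b


w_ab, g_b1 = Fraction(1, 2), Fraction(1, 10)
stages = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)]  # assumed p_a1 values

for p_a1 in stages:
    p_b1 = generalization(w_ab, p_a1, g_b1)
    # Hypothetical sets with 5 common elements out of 10 give w_ab = .5.
    assert p_b1 == generalization_from_counts(5, 10, p_a1, g_b1)
    print(float(p_a1), float(p_b1))
```

Setting `w_ab` to 0 returns `g_b1` for every training level, and setting it to 1 returns `p_a1`, which are the two end points at which the family of gradients converges and diverges.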


One might regard the parameter $w_{ab}$ as an index of the similarity of $S_a$ to $S_b$. In general, similarity is not a symmetrical relation, for $w_{ab}$ is not equal to $w_{ba}$ (the former being given by $N_{ab}/N_b$ and the latter by $N_{ab}/N_a$) except in the special case $N_a = N_b$. When $N_a \neq N_b$, generalization from training with the larger set to a test with the smaller set will be greater than generalization from training with the smaller set to a test with the larger set (assuming that the reinforcement given the reference response $A_1$ in the presence of the training set $S_a$ is such as to establish the same value of $p_{a1}$ in each case prior to testing in $S_b$). We shall give no formal assumption relating size of a stimulus set to observable properties; however, it is reasonable to expect that larger sets will be associated with more intense (where the notion of intensity is applicable) or attention-getting stimuli. Thus if $S_a$ and $S_b$ represent tones a and b of the same frequency but with tone a more intense than b, we should predict greater generalization if we train the reference response to a given level with a and test with b than if we train to the same level with b and test with a.

It is worth noting that, although in the psychological literature the notion of stimulus generalization has nearly always been taken to refer to generalization along some physical continuum such as wavelength of light, intensity of sound, or the like, the set-theoretical model is not restricted to such cases. Predictions of generalization in the case of complex stimuli may be generated by first evaluating the overlap parameter $w_{ab}$ for a given pair of situations a and b from a set of observations obtained with some particular combination of values of $p_{a1}$ and $g_{b1}$, then computing theoretical values of $p_{b1}$ for new conditions involving different levels of $p_{a1}$ and $g_{b1}$. The problem of treating a simple "stimulus dimension" is of special interest, however, and we shall conclude our discussion of generalization by sketching one approach to this problem.^10

^10 We follow, in most respects, the treatment given by W. K. Estes and D. L. La Berge in unpublished notes prepared for the 1957 SSRC Summer Institute in Social Science for College Teachers of Mathematics. For an approach combining essentially the same set-theoretical model with somewhat different learning assumptions, the reader is referred to Restle (1961).


We shall consider the type of stimulus dimension that Stevens (1957) has termed substitutive, or metathetic, i.e., one which involves the notion of a simple ordering of stimuli along a dimension without variation in intensity or magnitude. Let us denote by $Z$ a physical dimension of this sort, e.g., wavelength of visible light, which we wish to represent by a sequence of stimulus sets. First we shall briefly outline the properties that we wish this representation to have, then we shall spell out the assumptions of the model more rigorously.

It is part of the intuitive basis of a substitutive dimension that one moves from point to point by exchanging some of the elements of one stimulus for some new ones belonging to the next. Consequently, we shall assume that as values of $Z$ change by constant increments, each successive stimulus set should be generated by deleting some constant number of elements from the preceding set and adding the same number of new elements to form the next set. But to ensure that the organism's behavior can reflect the ordering of stimuli along the $Z$ scale without ambiguity, we need also to assume that once an element is deleted as we go along the $Z$ scale, it must not reappear in the set corresponding to any higher $Z$ value. Further, in view of the abundant empirical evidence that generalization declines in an orderly fashion as the distance between two stimuli on such a dimension increases, we must assume that at least up to the point where sets corresponding to larger differences in $Z$ are disjoint, the overlap between two stimulus sets should be directly related to the interval between the corresponding stimuli on the $Z$ scale. These properties, taken together, enable us to establish an intuitively reasonable correspondence between characteristics of a sequence of stimulus sets and the empirical notion of generalization along a dimension.

These ideas are incorporated more formally in the following set of axioms. The basis for these axioms is a stimulus dimension $Z$, which may be either continuous or discontinuous, a collection $S_*$ of stimulus sets, and a function $x(Z)$ having a finite number of consecutive integers in its range. The mapping of the set $(x)$ of scaled stimulus values onto the subsets $(S_i)$ of $S_*$ must satisfy the axioms:

G1. For all $i \le j \le k$ in $(x)$, $S_i \cap S_k \subseteq S_j$.

G2. For all $i < j < k$ in $(x)$, if $S_i \cap S_k \neq \emptyset$, where $\emptyset$ is the null set, then $S_j \subseteq (S_i \cup S_k)$.

G3. For all $h \le i$, $j \le k$ in $(x)$, if $i - h = k - j$, then $N_{hi} = N_{jk}$; and for all $i$ in $(x)$, $N_{ii} = N$.

The set $(x)$ may simply be a set of $Z$ scale values, or it may be a set of $Z$ values rescaled by some transformation. The reasons for introducing $(x)$ are twofold. First, for reasons of mathematical simplicity we find it advisable to restrict ourselves, at least for present purposes, to a finite set of $Z$ values, and therefore to a finite collection of stimulus sets. Second, there is no reason to think that equal distances along physical dimensions will in general correspond to equal overlaps between stimulus sets. All that is required, however, to make the theory workable is that for any given physical dimension, wavelength of light, frequency of a tone, or whatever, we can find experimentally a transformation $x$ such that equal distances on the $x$ scale do correspond to equal overlaps.


Axiom G1 states that if an element belongs to any two sets it also belongs to all sets which fall between these two sets on the $x$ scale. Axiom G2 states that, if two sets have any common elements, then all of the elements of any set falling between them belong to one or the other (or both) of the given sets; this property ensures that the elements drop out of the sets in order as we move along the dimension. Axiom G3 states the property which distinguishes a simple substitutive dimension from an additive, or intensity (in Stevens' terminology, prothetic) dimension.

It should be noted that only if the number of values in the range of $x(Z)$ is no greater than $N(S_*) - N + 1$ can Axiom G3 be satisfied. This restriction is necessary in order to obtain a one-to-one mapping of the $x$ values into the subsets $(S_i)$ of $S_*$.
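
One way to see what Axioms G1 to G3 demand is to build a small family of sets and test the axioms mechanically. The sketch below assumes a "sliding window" construction, in which each step along the $x$ scale drops the lowest element and adds one new element; this construction and all function names are illustrative assumptions, not part of the original treatment.

```python
def make_sets(n_values, N):
    """Sliding-window sets: the set for scale value x holds elements x, ..., x + N - 1."""
    return [frozenset(range(x, x + N)) for x in range(n_values)]


def satisfies_g1(sets):
    """G1: for i <= j <= k, every element common to S_i and S_k is also in S_j."""
    idx = range(len(sets))
    return all(sets[i] & sets[k] <= sets[j]
               for i in idx for j in idx for k in idx if i <= j <= k)


def satisfies_g2(sets):
    """G2: for i < j < k with S_i and S_k overlapping, S_j lies inside their union."""
    idx = range(len(sets))
    return all(sets[j] <= (sets[i] | sets[k])
               for i in idx for j in idx for k in idx
               if i < j < k and sets[i] & sets[k])


def satisfies_g3(sets, N):
    """G3: all sets have size N, and overlap depends only on the scale distance."""
    idx = range(len(sets))
    equal_size = all(len(s) == N for s in sets)
    equal_overlap = all(len(sets[h] & sets[i]) == len(sets[j] & sets[k])
                        for h in idx for i in idx for j in idx for k in idx
                        if h <= i and j <= k and i - h == k - j)
    return equal_size and equal_overlap


window_sets = make_sets(n_values=6, N=4)
print(satisfies_g1(window_sets), satisfies_g2(window_sets), satisfies_g3(window_sets, 4))

# A family in which a deleted element reappears later violates G1.
bad_sets = [frozenset({0, 1}), frozenset({2, 3}), frozenset({0, 4})]
print(satisfies_g1(bad_sets))
```

In the sliding-window family the union of all sets contains $6 + 4 - 1 = 9$ elements, so the six scale values exactly meet the bound $N(S_*) - N + 1 = 6$ stated above.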

One advantage in having the axioms set forth explicitly is that it then becomes relatively easy to design experiments bearing upon various aspects of the model. Thus, to obtain evidence concerning the empirical tenability of Axiom G1, we might choose a response $A_1$ and a set $(x)$ of stimuli, including a pair $i$ and $k$ such that $\Pr(A_1 \mid i) = \Pr(A_1 \mid k) = 0$, then train subjects with stimulus $i$ only until $\Pr(A_1 \mid i) = 1$, and finally test with stimulus $k$. If $\Pr(A_1 \mid k)$ is found to be greater than zero, it must be concluded, in terms of the model, that $S_i \cap S_k \neq \emptyset$; i.e., the sets corresponding to $i$ and $k$ have some elements in common. Given $\Pr(A_1 \mid k) > 0$, it must be predicted that for every stimulus $j$ in $(x)$ such that $i < j < k$, $\Pr(A_1 \mid j) \ge \Pr(A_1 \mid k)$. Axiom G1 ensures that all of the elements of $S_k$ which are now conditioned to $A_1$ by virtue of belonging also to $S_i$ must be included in $S_j$, possibly augmented by other elements of $S_i$ which are not in $S_k$.

To deal similarly with Axiom G2, we proceed in the same way to locate two members $i$ and $k$ of a set $(x)$ such that $S_i \cap S_k \neq \emptyset$. Then we train subjects on both stimulus $i$ and stimulus $k$ until $\Pr(A_1 \mid i) = \Pr(A_1 \mid k) = 1$, response $A_1$ being one which before this training had probability of less than unity to all stimuli in $(x)$. Now, by G2, if any stimulus $j$ falls between $i$ and $k$, the set $S_j$ must be contained entirely in the union $S_i \cup S_k$; consequently, we must predict that we will now find $\Pr(A_1 \mid j) = 1$ for any stimulus $j$ such that $i < j < k$.

To evaluate Axiom G3 empirically we require four stimuli $h \le i$, $j \le k$, such that $i - h = k - j$. If the four stimuli are all different, we can simply train subjects on $h$ and test generalization to $i$, then train subjects to an equal degree on $j$ and test generalization to $k$. If the amount of generalization, as measured by the probability of the test response, is the same in the two cases, then the axiom is supported. In the special case when $h = i$ and $j = k$, we would be testing the assertion that the sets associated with different values of $x$ are of equal size. To accomplish this test, we need only take any two neighboring values of $x$, say $i$ and $j$, train subjects to some criterion on $i$ and test on $j$, then reverse the procedure by training (different) subjects to the same criterion on $j$ and testing on $i$. If the axiom is satisfied, the amount of generalization should be the same in both directions.

Once we have introduced the notion of a dimension, it is natural to inquire whether the parameter which represents the degree of communality between pairs of stimulus sets might not be related in some simple way to a measure of distance along the dimension. With one qualification, which we will mention later, the quantity $d_{ij} = 1 - w_{ij}$ could serve as a suitable measure of the distance between stimuli $i$ and $j$. We can check to see whether the familiar axioms for a metric are satisfied. These axioms are

1. $d_{ij} = 0$ if and only if $i = j$;
2. $d_{ij} \ge 0$;
3. $d_{ij} = d_{ji}$;
4. $d_{ij} + d_{jk} \ge d_{ik}$,

where it is understood that $i$, $j$, and $k$ are any members of the set $(x)$ associated with a given dimension. The first three of these obviously hold, but the fourth requires a bit of analysis. To carry out a proof, we shall use the notation $N_{ijk}$ for the number of elements common to $S_i$, $S_j$, and $S_k$; $N_{ij\bar{k}}$ for the number of elements in both $S_i$ and $S_j$ but not in $S_k$; and so on. The difference between the two sides of the inequality we wish to establish can be expanded in terms of this notation as follows:

\[
\begin{aligned}
d_{ij} + d_{jk} - d_{ik}
&= \left(1 - \frac{N_{ij}}{N}\right) + \left(1 - \frac{N_{jk}}{N}\right) - \left(1 - \frac{N_{ik}}{N}\right) \\
&= \frac{1}{N}\left(N - N_{ij} - N_{jk} + N_{ik}\right) \\
&= \frac{1}{N}\Big[\big(N_{ijk} + N_{ij\bar{k}} + N_{\bar{i}jk} + N_{\bar{i}j\bar{k}}\big) - \big(N_{ijk} + N_{ij\bar{k}}\big) - \big(N_{ijk} + N_{\bar{i}jk}\big) + \big(N_{ijk} + N_{i\bar{j}k}\big)\Big] \\
&= \frac{1}{N}\left(N_{\bar{i}j\bar{k}} + N_{i\bar{j}k}\right) .
\end{aligned}
\]

The last expression on the right is non-negative, which establishes the desired inequality. To find the restrictions under which $d$ is additive, let us assume that stimuli $i$, $j$, and $k$ fall in the order $i < j < k$ on the dimension. Then, by Axiom G1, we know that $N_{i\bar{j}k} = 0$. However, it is only in the special cases when $S_i$ and $S_k$ are either overlapping or adjacent that $N_{\bar{i}j\bar{k}} = 0$, and, therefore, that $d_{ij} + d_{jk} = d_{ik}$.

It is possible to define an additive distance measure which is not 
subject to this restriction, but such extensions raise new problems and 
we shall not be able to pursue them here. 
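
The behavior of the distance measure can be examined on a concrete family of sets. The sketch below again assumes sliding-window sets of equal size (a hypothetical construction chosen because it satisfies G1 to G3); the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def make_sets(n_values, N):
    """Sliding-window sets: the set for scale value x holds elements x, ..., x + N - 1."""
    return [frozenset(range(x, x + N)) for x in range(n_values)]


def distance(s_i, s_j):
    """d_ij = 1 - w_ij, with w_ij = N_ij / N_j (all sets here have the same size N)."""
    return 1 - Fraction(len(s_i & s_j), len(s_j))


sets = make_sets(n_values=8, N=4)
idx = range(len(sets))

# Axiom 4 (triangle inequality) holds for every triple of scale values.
triangle_ok = all(distance(sets[i], sets[j]) + distance(sets[j], sets[k])
                  >= distance(sets[i], sets[k])
                  for i, j, k in product(idx, repeat=3))

# Additivity holds when S_i and S_k overlap or are adjacent ...
additive_near = (distance(sets[0], sets[1]) + distance(sets[1], sets[3])
                 == distance(sets[0], sets[3]))
# ... but fails once S_i and S_k are separated by elements belonging to neither.
additive_far = (distance(sets[0], sets[1]) + distance(sets[1], sets[5])
                == distance(sets[0], sets[5]))

print(triangle_ok, additive_near, additive_far)
```

In the last comparison the intermediate set contains an element that belongs to neither end set, so the corresponding count $N_{\bar{i}j\bar{k}}$ is positive and the distances are no longer additive.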

In concluding this section, we should like to emphasize one difference between the model for generalization sketched here and some of those already familiar in the literature (see, e.g., Spence, 1936; Hull, 1943). We do not postulate a particular form for generalization of response strength or excitatory tendency. Rather, we introduce certain assumptions about the properties of the set of stimuli associated with a sensory dimension; then we take these together with learning assumptions and information about reinforcement schedules as a basis for deriving theoretical gradients of generalization for particular types of experiments. Under the special conditions assumed in the example considered above, the theory predicts that a family of linear gradients with very simple properties will be observed when response probability is plotted as a function of distance from the point of reinforcement. Predictions of this sort may reasonably be tested by means of experiments in which suitable measures are taken to meet the conditions assumed in the derivations (see, e.g., Carterette, 1961). But to deal with experiments involving different training conditions, or response measures other than relative frequencies, further theoretical analysis is called for; and one must be prepared to find substantial differences in the phenotypic properties of generalization gradients derived from the same basic theory for different experimental situations.
5. COMPONENT AND LINEAR MODELS FOR SIMPLE LEARNING

In this section we combine, in a sense, the theories discussed in the preceding sections. Until now it was convenient, for expositional purposes, to treat the problems of learning and generalization separately. We first considered a type of learning model in which the different possible samples of stimulation from trial to trial were assumed to be entirely disjoint, and then turned to an analysis of generalization, or transfer, effects that could be measured on an isolated test trial following a series of learning trials. Prediction of these transfer effects depended on information concerning the state of the stimulus population just prior to the test trial but did not depend on information about the course of learning over preceding training trials. However, in many (perhaps most) learning situations, it is not reasonable to assume that the samples, or patterns, of stimulation affecting the organism on different trials of a series are entirely disjoint; rather, they must overlap to various intermediate degrees, thus generating transfer effects throughout the learning series. In the "component models" of stimulus sampling theory, one simply takes the learning assumptions of the pattern model (Sec. 3) together with the sampling axioms and response rule of the generalization model (Sec. 4) to generate an account of learning for this more general case.

5.1 Component Models with Fixed Sample Size

As indicated earlier, the analysis of a simple learning experiment in terms of a component model is based on the representation of the stimulus as a set $S$ of $N$ stimulus elements from which the subject draws a sample on each trial. At any time each element in the set $S$ is conditioned to exactly one of the $r$ response alternatives $A_1, \ldots, A_r$; by the response axiom of Sec. 4.1 the probability of a response is equal to the proportion of elements in the trial sample conditioned to that response. At the termination of a trial, if reinforcing event $E_i$ ($i \neq 0$) occurs, then with probability $c$ all elements in the trial sample become conditioned to response $A_i$. If $E_0$ occurs the conditioned status of elements in the sample does not change. The conditioning parameter $c$ plays the same role here as in the pattern model. It should be noted that in the early literature of stimulus sampling theory, this parameter was usually assumed to be equal to unity.

Two general types of component models can be distinguished. For the fixed sample size model we assume that the sample size is a fixed number $s$ throughout any given experiment. For the independent sampling model we assume that the elements of the stimulus set, $S$, are sampled independently on each trial, each element having some fixed probability $\theta$ of being drawn. In this section we discuss the fixed sample size model and consider the case in which all possible samples of size $s$ are sampled with equal probability.

Formulation for RTT Experiments. To illustrate the model we first consider an experimental procedure in which a particular stimulus item is given a single reinforced trial followed by two consecutive nonreinforced test trials. The design may be conveniently symbolized $RT_1T_2$. Procedures and results for a number of experiments using an RTT design have been reported elsewhere (Estes, 1960a; Estes, Hopkins and Crothers, 1960; Estes, 1961b; Crothers, 1961). For simplicity, suppose one selects a situation in which the probability of a correct response is zero before the first reinforcement (and in which the likelihood of a subject's obtaining correct responses by guessing is negligible on all trials). In terms of the fixed sample size model we can readily generate predictions for the probabilities, $p_{ij}$, of various combinations of response $i$ on $T_1$ and response $j$ on $T_2$. If $i, j = 0$ denotes a correct response and $i, j = 1$ denotes an error, then

\[
\begin{aligned}
p_{00} &= c\left(\frac{s}{N}\right)^2 \\
p_{01} &= c\left(\frac{s}{N}\right)\left[1 - \frac{s}{N}\right] \\
p_{10} &= c\left(\frac{s}{N}\right)\left[1 - \frac{s}{N}\right] \\
p_{11} &= 1 - c + c\left[1 - \frac{s}{N}\right]^2 .
\end{aligned}
\tag{55}
\]

To obtain the first result, we note that the correct response can occur on either trial only if conditioning occurs on the reinforced trial, which has probability $c$. On occasions when conditioning occurs, the whole sample of $s$ elements becomes conditioned to the correct response and the probability of this response on each of the test trials is $s/N$. On occasions when conditioning does not occur on the reinforced trial, probability of a correct response remains at zero over both test trials.
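
The four RTT probabilities of Eq. 55 can be computed and checked for internal consistency with a few lines of code. This is a minimal sketch; the function name `rtt_probabilities` and the numerical values of $c$, $s$, and $N$ are illustrative assumptions.

```python
from fractions import Fraction


def rtt_probabilities(c, s, N):
    """Eq. 55: joint probabilities of responses on T1 and T2 after one reinforcement.

    Keys are (i, j) with 0 = correct and 1 = error on T1 and T2 respectively.
    Initial probability of a correct response is assumed to be zero.
    """
    ratio = Fraction(s, N)
    return {
        (0, 0): c * ratio ** 2,
        (0, 1): c * ratio * (1 - ratio),
        (1, 0): c * ratio * (1 - ratio),
        (1, 1): 1 - c + c * (1 - ratio) ** 2,
    }


probs = rtt_probabilities(c=Fraction(1, 2), s=2, N=4)
print(probs, sum(probs.values()))

# With s = N = 1 the predictions reduce to those of the one-element model:
# both tests are correct with probability c and both wrong with probability 1 - c.
print(rtt_probabilities(c=Fraction(1, 2), s=1, N=1))
```

A characteristic feature of these predictions is that $p_{01} = p_{10}$, so the model allows no systematic change in performance from the first test to the second.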
Note that when $s = N = 1$ this model is equivalent to the one-element model discussed in Sec. 2.1. If more than one reinforcement is given prior to $T_1$, the predictions are essentially unchanged. In general, for $k$ preceding reinforcements, the expected proportion of elements conditioned to the correct response (i.e., the probability of a correct response) at the time of the first test is

\[ p_0 = 1 - \left(1 - \frac{cs}{N}\right)^k \]

and the probability of correct responses on both $T_1$ and $T_2$ is given by

\[ p_{00} = \sum_{i=0}^{k} \binom{k}{i} c^i (1 - c)^{k-i} \left[1 - \left(1 - \frac{s}{N}\right)^i\right]^2 . \]

To obtain this last expression, we note that a subject for whom $i$ of the $k$ reinforcements have been effective will have probability $\left[1 - (1 - \frac{s}{N})^i\right]$ of making a correct response on each test, and the probability that exactly $i$ reinforcements are effective is $\binom{k}{i} c^i (1 - c)^{k-i}$. Similarly,

\[ p_{10} = p_{01} = \sum_{i=0}^{k} \binom{k}{i} c^i (1 - c)^{k-i} \left[1 - \left(1 - \frac{s}{N}\right)^i\right] \left(1 - \frac{s}{N}\right)^i , \]

and

\[ p_{11} = (1 - c)^k + \sum_{i=1}^{k} \binom{k}{i} c^i (1 - c)^{k-i} \left(1 - \frac{s}{N}\right)^{2i} . \]

If $s = N$, these expressions reduce to

\[
\begin{aligned}
p_{00} &= 1 - (1 - c)^k \\
p_{10} &= p_{01} = 0 \\
p_{11} &= (1 - c)^k .
\end{aligned}
\]

This special case appears well suited to the interpretation of data obtained by G. H. Bower (personal communication) from a study in which the $T_1T_2$ procedure was applied following various numbers of presentations of word-word paired-associates. For 32 subjects each tested on 10 items, Bower reports observed proportions of $p_{00} = .894$, $p_{10} = p_{01} = .003$, and $p_{11} = .100$.
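
The expressions for $k$ preceding reinforcements can be evaluated directly and compared with the special case $s = N$. The sketch below is illustrative; the function name `rtt_after_k` and the parameter values are assumptions, and no attempt is made here to fit Bower's data.

```python
from fractions import Fraction
from math import comb


def rtt_after_k(c, s, N, k):
    """Joint T1/T2 probabilities after k reinforcements (fixed sample size model).

    Returns (p00, p10, p11); p01 equals p10. The i = 0 term of the p11 sum
    supplies the (1 - c)**k contribution written separately in the text.
    """
    miss = 1 - Fraction(s, N)          # proportion left unconditioned per effective trial
    p00 = p10 = p11 = Fraction(0)
    for i in range(k + 1):
        weight = comb(k, i) * c ** i * (1 - c) ** (k - i)   # exactly i effective trials
        q = miss ** i                  # probability of an error on a single test
        p00 += weight * (1 - q) ** 2
        p10 += weight * (1 - q) * q
        p11 += weight * q ** 2
    return p00, p10, p11


c, k = Fraction(1, 3), 3

# General case: the four joint probabilities sum to one, and the marginal
# probability of a correct first test is 1 - (1 - c*s/N)**k.
p00, p10, p11 = rtt_after_k(c, s=2, N=4, k=k)
print(p00 + 2 * p10 + p11, p00 + p10, 1 - (1 - c * Fraction(2, 4)) ** k)

# Special case s = N: mixed outcomes vanish.
print(rtt_after_k(c, s=4, N=4, k=k))
```

In the special case the only possible outcomes are two correct responses or two errors, which is the pattern that makes near-zero observed values of $p_{10}$ and $p_{01}$ informative.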

When applied to other RTT experiments, this model has, however, not yielded consistently accurate predictions. The difficulty apparently stems from the fact that our assumptions do not take account of the retention loss that is usually observed from $T_1$ to $T_2$ (see, e.g., Estes, 1961b). An extension of the model which is capable of handling retention decrement as well as the acquisition process will be discussed in Sec. 5.2 below.


For RTT experiments in which the probability of successful guessing is not negligible (as in paired-associate tasks involving a fixed list of responses which are known to the subject from the start) some additional considerations arise. Perhaps the most natural extension of the preceding treatment is to assume that the subject starts the experiment with a proportion $\frac{1}{r}$ of the elements of a given set $S_i$ connected to the correct response and a proportion $(1 - \frac{1}{r})$ connected to incorrect responses, $r$ being the number of alternative responses. Then for a fixed sample size model, the probability, $p_0$, of a correct response to a given item on the first test trial after a single reinforcement is

\[
\begin{aligned}
p_0 &= (1 - c)\frac{1}{r} + c\left[\frac{s + (N - s)/r}{N}\right] \\
&= \left(1 - \frac{cs}{N}\right)\frac{1}{r} + \frac{cs}{N} ,
\end{aligned}
\]

the bracketed quantity being the proportion of elements connected to the correct response in the event that the reinforcement is effective. Then the probabilities of various combinations of correct and incorrect responses on the two test trials are given by

\[
\begin{aligned}
p_{00} &= (1 - c)\frac{1}{r^2} + c\,\phi^2 \\
p_{10} = p_{01} &= (1 - c)\frac{1}{r}\left(1 - \frac{1}{r}\right) + c\,\phi(1 - \phi) \\
p_{11} &= (1 - c)\left(1 - \frac{1}{r}\right)^2 + c\,(1 - \phi)^2 ,
\end{aligned}
\tag{56}
\]

where $\phi = \frac{s}{N} + \left(1 - \frac{s}{N}\right)\frac{1}{r}$.

An alternative approach to the type of experiment in which the
subject guesses on unlearned items is to assume that initially all
elements are neutral, i.e., are connected neither to correct nor to
incorrect responses. In the presence of a sample containing only neutral
elements, the subject guesses, with probability $1/r$ of being correct.
If the sample contains any conditioned elements, then the proportion of
conditioned elements in the sample connected to the correct response
determines its probability (e.g., if the sample contains nine elements,
three conditioned to the correct response, two conditioned to an
incorrect response, and four unconditioned, then the probability of a
correct response is simply 3/5). These assumptions seem in some respects
more intuitively satisfactory than those considered above. Perhaps the
most important difference with respect to empirical implications lies in
the fact that with the latter set of assumptions, exposure time on test
trials must be taken into account. If the stimulus exposure time is just
long enough to permit a response (in terms of the theory, just long
enough to permit the subject to draw a single sample of stimulus
elements), then the probabilities of correct and incorrect response
combinations on $T_1$ and $T_2$ are

    $p_{00} = (1 - c)\frac{1}{r^2} + c\phi'^2$,
    $p_{10} = p_{01} = (1 - c)\frac{1}{r}\left(1 - \frac{1}{r}\right) + c\phi'(1 - \phi')$,    (57)
    $p_{11} = (1 - c)\left(1 - \frac{1}{r}\right)^2 + c(1 - \phi')^2$,

where $\phi' = 1 - \left(1 - \frac{1}{r}\right)\binom{N-s}{s} / \binom{N}{s}$.
The factor $\binom{N-s}{s} / \binom{N}{s}$ is the probability that the
subject draws a sample containing none of the $s$ elements that became
conditioned on the reinforced trial; therefore $1 - \phi'$ represents
the probability that a subject for whom the reinforced trial was
effective nevertheless draws a sample containing no conditioned elements
and makes an incorrect guess, whereas $\phi'$ is the probability that
such a subject makes a correct response on either test trial.
The two sets of equations 56 and 57 are formally identical, and thus
cannot be distinguished in application to RTT data. Like Equation 55,
they have the limitation of not allowing adequately for the retention
loss usually observed (see, e.g., Estes, Hopkins, and Crothers, 1960);
we shall return to this point in Sec. 5.2.
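
The common form of Eqs. 56 and 57 can be checked numerically. The following is a minimal sketch, not part of the original treatment: the function names and the example parameter values are illustrative assumptions. It takes the two guessing models to differ only in the quantity playing the role of phi, namely s/N + (1 - s/N)/r when elements start out connected to responses, and 1 - (1 - 1/r) C(N-s, s)/C(N, s) when elements start out neutral and a single sample is drawn.

```python
from math import comb

def phi_preconnected(s, N, r):
    # Proportion of elements connected to the correct response after an
    # effective reinforcement, when a fraction 1/r was connected at the start.
    return s / N + (1 - s / N) / r

def phi_neutral(s, N, r):
    # Probability of a correct response after an effective reinforcement when
    # all elements start neutral and one sample of size s is drawn on the test.
    miss = comb(N - s, s) / comb(N, s)  # sample contains no conditioned element
    return 1 - (1 - 1 / r) * miss

def rtt_probs(c, phi, r):
    # Joint probabilities for the two test trials (0 = correct, 1 = error).
    g = 1 / r  # guessing probability
    p00 = (1 - c) * g * g + c * phi * phi
    p01 = (1 - c) * g * (1 - g) + c * phi * (1 - phi)
    p11 = (1 - c) * (1 - g) ** 2 + c * (1 - phi) ** 2
    return {"p00": p00, "p01": p01, "p10": p01, "p11": p11}

if __name__ == "__main__":
    # Illustrative values only: c = 0.4, s = 2, N = 6, r = 3.
    print(rtt_probs(0.4, phi_preconnected(2, 6, 3), 3))
    print(rtt_probs(0.4, phi_neutral(2, 6, 3), 3))
```

Passing phi = 1 to `rtt_probs` gives the case in which a subject for whom the reinforcement was effective always responds correctly.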

If exposure time is sufficiently long on the test trials, then we
assume that the subject continues to draw successive random samples from
S and only makes a response when he finally draws a sample containing
at least one conditioned element. Thus, in cases in which the
reinforcement has been effective on a previous trial (so that S contains
a subset of $s$ conditioned elements), the subject will eventually draw
a sample containing one or more conditioned elements and will respond on
the basis of these elements, thereby making a correct response with
probability 1. Therefore, for the case of unlimited exposure time,
$\phi' = 1$ and Eq. 57 reduces to

    $p_{00} = (1 - c)\frac{1}{r^2} + c$,
    $p_{01} = p_{10} = (1 - c)\frac{1}{r}\left(1 - \frac{1}{r}\right)$,    (58)
    $p_{11} = (1 - c)\left(1 - \frac{1}{r}\right)^2$,

which are identical with the corresponding equations for the one-element
model of Sec. 2.2.

General Formulation. We turn now to the problem of deriving
predictions from the fixed sample size model concerning the course of
learning over an experiment consisting of a sequence of trials run
under some prescribed reinforcement schedule. We shall limit
consideration to the case in which each element in S is conditioned to
exactly one of the two response alternatives, $A_1$ or $A_2$, so that
there are N + 1 conditioning states. Again, we let $C_i$ (i = 0, ..., N)
denote the state in which i elements of the set S are conditioned to
$A_1$ and N - i to $A_2$. As in the pattern model the transition
probabilities among conditioning states are functions of the
reinforcement schedules and the set-theoretical parameters c, s, and N.
Following our approach in Sec. 3.1, we shall restrict the analysis to
cases in which the probability of reinforcement depends at most upon the
response on the given trial; we thereby guarantee that all elements in
the transition matrix for conditioning states are constant over trials.
Thus the sequence of conditioning states can again be conceived as a
Markov chain.


Transition Probabilities. Let $s_{i,n}$ denote the event of drawing
a sample on trial n with i elements conditioned to $A_1$ and s - i
conditioned to $A_2$. Then the probability of a one-step transition from
state $C_j$ to state $C_{j+v}$ is given by

    $q_{j,j+v} = c\,\frac{\binom{N-j}{v}\binom{j}{s-v}}{\binom{N}{s}}\,\Pr(E_1 \mid s_{s-v}C_j)$,    (59a)

where $\Pr(E_1 \mid s_{s-v}C_j)$ is the probability of an $E_1$ event
given conditioning state $C_j$ and a sample with s - v elements
conditioned to $A_1$ (and hence v conditioned to $A_2$). To obtain
Eq. 59a we note that an $E_1$ must occur and that the subject must
sample exactly v elements from the N - j elements not already
conditioned to $A_1$; the probability of the latter event is the number
of ways of drawing samples with v elements conditioned to $A_2$ divided
by the total number of ways of drawing samples of size s. Similarly

    $q_{j,j-v} = c\,\frac{\binom{j}{v}\binom{N-j}{s-v}}{\binom{N}{s}}\,\Pr(E_2 \mid s_vC_j)$    (59b)

and

    $q_{jj} = 1 - c + c\left[\frac{\binom{j}{s}}{\binom{N}{s}}\Pr(E_1 \mid s_sC_j) + \frac{\binom{N-j}{s}}{\binom{N}{s}}\Pr(E_2 \mid s_0C_j) + \Pr(E_0 \mid C_j)\right]$.    (59c)


Although it is an obvious conclusion, it is important for the reader to
realize that the pattern model discussed in Part 3 is identical to the
fixed sample size model when s = 1. This correspondence between the
two models is indicated by the fact that Eq. 59 reduces to Eq. 23 when
we let s = 1.

For the simple noncontingent schedule in which only the two events
$E_1$ and $E_2$ occur (with probabilities $\pi$ and $1 - \pi$,
respectively) Eqs. 59a to 59c simplify to

    $q_{j,j+v} = c\pi\,\frac{\binom{N-j}{v}\binom{j}{s-v}}{\binom{N}{s}}$,    (60a)

    $q_{j,j-v} = c(1 - \pi)\,\frac{\binom{j}{v}\binom{N-j}{s-v}}{\binom{N}{s}}$,    (60b)

    $q_{jj} = 1 - c + c\left[\pi\,\frac{\binom{j}{s}}{\binom{N}{s}} + (1 - \pi)\,\frac{\binom{N-j}{s}}{\binom{N}{s}}\right]$.    (60c)

It is apparent that state $C_N$ is an absorbing state when $\pi = 1$ and
$C_0$ is an absorbing state when $\pi = 0$. Otherwise all states are
ergodic.
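
The noncontingent transition probabilities can be tabulated directly from binomial coefficients. The sketch below is illustrative only: `transition_matrix` is a hypothetical helper name, and rows and columns are indexed by the number j of elements conditioned to A1, running from 0 to N. It assumes the forms q(j, j+v) = c pi C(N-j, v) C(j, s-v) / C(N, s), q(j, j-v) = c (1 - pi) C(j, v) C(N-j, s-v) / C(N, s), and q(j, j) = 1 - c + c [pi C(j, s) + (1 - pi) C(N-j, s)] / C(N, s).

```python
from math import comb

def transition_matrix(N, s, c, pi):
    # Q[j][k] is the probability of moving from state C_j to state C_k.
    total = comb(N, s)  # number of distinct samples of size s
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    for j in range(N + 1):
        for v in range(1, s + 1):
            if j + v <= N:
                # E1 occurs, conditioning is effective, and the sample holds
                # v elements that were conditioned to A2.
                Q[j][j + v] = c * pi * comb(N - j, v) * comb(j, s - v) / total
            if j - v >= 0:
                # E2 occurs, conditioning is effective, and the sample holds
                # v elements that were conditioned to A1.
                Q[j][j - v] = c * (1 - pi) * comb(j, v) * comb(N - j, s - v) / total
        # No change: conditioning ineffective, or the whole sample was already
        # conditioned to the reinforced response.
        Q[j][j] = 1 - c + c * (pi * comb(j, s) + (1 - pi) * comb(N - j, s)) / total
    return Q

if __name__ == "__main__":
    # Two elements, sample size one, E1 on every trial; c = 0.43 is illustrative.
    for row in transition_matrix(2, 1, 0.43, 1.0):
        print([round(x, 3) for x in row])
```

With pi = 1 the last row of the matrix has all its mass on the diagonal, which is the absorbing state C_N noted above.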

Mean Learning Curve. Following the same techniques used in
connection with Eq. 27 we obtain for the component model in the simple,
noncontingent case

    $\Pr(A_{1,n}) = \pi - [\pi - \Pr(A_{1,1})]\left[1 - \frac{cs}{N}\right]^{n-1}$.    (61)

This mean learning function traces out a smooth growth curve that can
take any value between 0 and 1 on trial n if parameters are selected
appropriately. However, it is important to note that for a given
realization of the experiment the actual response probabilities for
individual subjects (as opposed to expectations) can only take on the
values $0, 1/N, 2/N, \ldots, (N-1)/N, 1$; i.e., the values associated
with the conditioning states. This step-wise aspect of the process is
particularly important when one attempts to distinguish between this
model and models that assume gradual continuous increments in the
strength or probability of a response over time (Hull, 1943; Bush and
Mosteller, 1955; Estes and Suppes, 1959a).
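
The relation between the smooth mean curve and the discrete conditioning states can be made concrete by propagating the full state distribution and comparing its mean with the closed form pi - [pi - Pr(A_{1,1})](1 - cs/N)^(n-1). This is a sketch under the noncontingent assumptions (E1 with probability pi, E2 otherwise); the function names `step` and `mean_curve` and the parameter values are illustrative assumptions.

```python
from math import comb

def step(dist, N, s, c, pi):
    # Advance the distribution over conditioning states by one trial.
    # dist[j] is the probability that j elements are conditioned to A1.
    total = comb(N, s)
    new = [0.0] * (N + 1)
    for j, pj in enumerate(dist):
        if pj == 0.0:
            continue
        new[j] += (1 - c) * pj  # conditioning ineffective
        for i in range(s + 1):  # i sampled elements already conditioned to A1
            h = comb(j, i) * comb(N - j, s - i) / total  # hypergeometric weight
            if h == 0.0:
                continue
            new[j + s - i] += c * pi * h * pj   # E1: sampled A2 elements switch
            new[j - i] += c * (1 - pi) * h * pj  # E2: sampled A1 elements switch
    return new

def mean_curve(n, p1, N, s, c, pi):
    # Closed-form mean probability of A1 on trial n, starting from p1 on trial 1.
    return pi - (pi - p1) * (1 - c * s / N) ** (n - 1)

if __name__ == "__main__":
    N, s, c, pi = 5, 2, 0.3, 0.7  # illustrative values
    dist = [1.0] + [0.0] * N       # all subjects start in C_0
    for n in range(1, 6):
        mean = sum(j / N * p for j, p in enumerate(dist))
        print(n, round(mean, 6), round(mean_curve(n, 0.0, N, s, c, pi), 6))
        dist = step(dist, N, s, c, pi)
```

Each entry of `dist` attaches to a single response probability j/N, so only the average over states moves smoothly; any individual subject sits on one of the N + 1 lattice values.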

To illustrate this point we consider an experiment on avoidance
learning reported by Theios (1961). Fifty rats were used as subjects.
The apparatus was a modified Miller-Mowrer electric shock box. The
animal was always placed in the black compartment; shortly thereafter
a buzzer and light came on as the door between the compartments was
opened. The correct response ($A_1$) was to run into the other
compartment within 3 seconds. If $A_1$ did not occur the subject was
given a high intensity shock (255 volts) until it escaped into the other
compartment. After 20 seconds the subject was returned to the black
compartment, and another trial was given. Each rat was run until it met
a criterion of 20 consecutive successful avoidance responses.

Theios analyzes the situation in terms of a component model in
which N = 2 and s = 1. Further, he assumes that $\Pr(A_{1,1}) = 0$ and
hence on trial 1 the subject is in conditioning state $C_0$. Employing
Eq. 60 with $\pi = 1$, N = 2, and s = 1 we obtain the following
transition matrix:

               C_2       C_1        C_0
      C_2       1         0          0
      C_1      c/2     1 - c/2       0
      C_0       0         c        1 - c

And the expected probability of an $A_1$ response on trial n is readily
obtained by specialization of Eq. 61,

    $\Pr(A_{1,n}) = 1 - \left(1 - \frac{c}{2}\right)^{n-1}$.

Applying this model Theios estimates c = .43 and provides an impressive
account of such statistics as total errors, the mean learning curve,
trial number of last error, autocorrelation of errors with lags of 1,
2, 3 and 4 trials, mean number of runs, probability of no reversals, and
many others. However, for our immediate purposes we are interested in
only one feature of his data; namely, whether the underlying response
probabilities are actually fixed at 0, 1/2, and 1 as specified by the
model. First, we note that it is not possible to establish the exact
trial on which the subject moves from $C_0$ to $C_1$ or from $C_1$ to
$C_2$. Nevertheless, if there are some trials between the first success
($A_1$ response) and the last error ($A_2$ response), we can be sure
that the subject is in state $C_1$ on these trials. For if the subject
has made one success, at least one of the two stimulus elements is
conditioned to the $A_1$ response; if on a later trial the subject makes
an error, then, up to that trial, at least one of the elements is not
conditioned to the $A_1$ response. Since deconditioning does not occur
in the present model, the subject must be in conditioning state $C_1$.
Thus, according to the model, the sequence of responses after the first
success and before the last error should form a sequence of Bernoulli
trials with constant probability p = q = 1/2 of an $A_1$ response.
Theios has applied several statistical tests to check this hypothesis
and none suggest that the assumption is incorrect. For example, the
response sequences for the trials between the first success and last
error were divided into blocks of four trials and the number of $A_1$
responses in each block was counted. The obtained frequencies for 0, 1,
2, 3 and 4 successes were 2, 12, 17, 15, and 4, respectively; the
predicted binomial frequencies were 3.1, 12.5, 18.5, 12.5 and 3.1. The
correspondence between predicted and observed frequencies is excellent
as indicated by a $\chi^2$ goodness-of-fit test that yielded a value of
1.47 with 4 degrees of freedom.

Theios has applied the same analysis to data from an experiment by
Solomon and Wynne (1953) where dogs were required to learn an avoidance
response. The findings with regard to the binomial property on trials
after the first success and before the last error are in agreement with
his own data but suggest that the binomial parameter is other than 1/2.
From a stimulus sampling viewpoint this observation would suggest that
the two elements are not sampled with equal probabilities. For a detailed
discussion of this Bernoulli step-wise aspect of certain stimulus sampling
models, related statistical tests, and a review of relevant experimental
data the reader is referred to Suppes and Ginsberg (1962a).

The mean learning curve for the fixed sample size model given by
Eq. 61 is identical to the corresponding equation for the pattern model
with the sampling ratio s/N taking the role of 1/N. However, we need
not look far to find a difference in the predictions generated by the
two models. If we define $\alpha_{2,n}$ as in Eq. 29; i.e.,

    $\alpha_{2,n} = \sum_{i=0}^{N} \left(\frac{i}{N}\right)^2 \Pr(C_{i,n})$,

then by carrying out the summation using the same methods as in the case
of Eq. 27, we obtain

    $\alpha_{2,n+1} = \left[1 - \frac{cs}{N}\left(2 - \frac{s-1}{N-1}\right)\right]\alpha_{2,n} + \frac{cs(N-s)}{N^2}\left[2\pi + \frac{1}{N-1}\right]\alpha_{1,n} + c\pi\frac{s^2}{N^2}$,    (62)

where $\alpha_{1,n} = \Pr(A_{1,n})$.


The asymptotic variance of the response probabilities for the component
model is simply

    $\sigma_\infty^2 = \lim_{n \to \infty}\left[\alpha_{2,n} - \Pr^2(A_{1,n})\right]$.

Letting $\alpha_{2,n} = \alpha_{2,n+1} = \alpha_{2,\infty}$, noting that
$\Pr(A_{1,\infty}) = \pi$, and carrying out the appropriate computations
we obtain

    $\sigma_\infty^2 = \frac{\pi(1 - \pi)[N + (N - 2)s]}{N(2N - s - 1)}$.    (63)


This asymptotic variance of the response probabilities depends in
relatively simple ways on s and N. If we hold N fixed and differentiate
with respect to s, we find that $\sigma_\infty^2$ increases
monotonically with s; in particular, then, this variance for a fixed
sample size model with s > 1 is larger than that of the pattern model
with the same number of elements. If we hold the sampling ratio s/N
fixed and take the partial derivative with respect to N, we find
$\sigma_\infty^2$ to be a decreasing function of N. In the limit, if
$N \to \infty$ in such a way that $s/N = \theta$ remains constant, then

    $\sigma_\infty^2 \to \pi(1 - \pi)\frac{\theta}{2 - \theta}$,    (64)

which, we will see later, is the variance for the linear model (Estes
and Suppes, 1959a). In contrast, for the pattern model the variance of
the p values approaches 0 as N becomes large. We return to
comparisons between the two models in Sec. 5.3.
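
The dependence of the asymptotic variance on s and N is easy to tabulate. The sketch below assumes the closed form pi (1 - pi) [N + (N - 2) s] / [N (2N - s - 1)] for the fixed sample size model and pi (1 - pi) theta / (2 - theta) for the limit with s/N = theta held constant; the function names are illustrative, and the formula requires N >= 2.

```python
def asymptotic_variance(N, s, pi):
    # Asymptotic variance of response probabilities, fixed sample size model.
    # Requires N >= 2 so that the denominator is nonzero.
    return pi * (1 - pi) * (N + (N - 2) * s) / (N * (2 * N - s - 1))

def linear_limit(theta, pi):
    # Limiting variance as N grows with the sampling ratio s/N = theta fixed.
    return pi * (1 - pi) * theta / (2 - theta)

if __name__ == "__main__":
    pi = 0.6  # illustrative reinforcement probability
    # s = 1 is the pattern model, whose variance is pi (1 - pi) / N.
    for s in (1, 2, 5, 10):
        print("N=10, s=%d:" % s, round(asymptotic_variance(10, s, pi), 5))
    # Holding s/N = 0.25 fixed while N grows.
    for N in (4, 40, 400, 4000):
        print("N=%d:" % N, round(asymptotic_variance(N, N // 4, pi), 5))
    print("limit:", round(linear_limit(0.25, pi), 5))
```

With N fixed the printed values rise with s, and with s/N fixed they fall toward the limiting value as N grows.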

Sequential Predictions. We now examine some sequential statistics
for the fixed sample size model that later will help clarify relationships
among the various stimulus sampling models. In particular, we consider
the probability of an $A_1$ response given that on the preceding trial
$E_1$, $E_2$ or $E_0$ occurred.


Consider first $\Pr(A_{1,n+1} \mid E_{1,n})$. By taking account of the
conditioning states on trial n + 1 and trial n and also the sample on
trial n we may write

    $\Pr(A_{1,n+1} \mid E_{1,n}) = \frac{1}{\Pr(E_{1,n})} \sum_{i,j,k} \Pr(A_{1,n+1}C_{j,n+1}E_{1,n}s_{i,n}C_{k,n})$,

where, as before, $s_{i,n}$ denotes the event of drawing a sample on
trial n with i elements conditioned to $A_1$ and s - i conditioned to
$A_2$. Conditionalizing, with our learning axioms in mind, we obtain

    $\Pr(A_{1,n+1} \mid E_{1,n}) = \frac{1}{\Pr(E_{1,n})} \sum_{i,j,k} \Pr(A_{1,n+1} \mid C_{j,n+1}) \Pr(C_{j,n+1} \mid E_{1,n}s_{i,n}C_{k,n})$
    $\qquad \cdot \Pr(E_{1,n} \mid s_{i,n}C_{k,n}) \Pr(s_{i,n} \mid C_{k,n}) \Pr(C_{k,n})$.

Our reinforcement procedures depend at most on the responses of the
subject and hence $\Pr(E_{1,n} \mid s_{i,n}C_{k,n}) = \Pr(E_{1,n} \mid s_{i,n})$.
Further

    $\Pr(C_{j,n+1} \mid E_{1,n}s_{i,n}C_{k,n}) = c$ if $j = k + s - i$,
    $\qquad = 1 - c$ if $j = k$,
    $\qquad = 0$ otherwise.

That is, the s - i elements in the sample originally conditioned to
$A_2$ now become conditioned to $A_1$ with probability c and hence a
move from state $C_k$ to $C_{k+s-i}$ occurs. Also, as noted with regard
to Eq. 59,

    $\Pr(s_{i,n} \mid C_{k,n}) = \frac{\binom{k}{i}\binom{N-k}{s-i}}{\binom{N}{s}}$.

Substituting these results in our last expression for
$\Pr(A_{1,n+1} \mid E_{1,n})$ yields, when the probability of $E_{1,n}$
does not depend on the sample drawn (as in the noncontingent case),

    $\Pr(A_{1,n+1} \mid E_{1,n}) = \sum_k \sum_i \left[c\,\frac{k + s - i}{N} + (1 - c)\frac{k}{N}\right] \frac{\binom{k}{i}\binom{N-k}{s-i}}{\binom{N}{s}} \Pr(C_{k,n})$.
We now need the fact that the first raw moment of the hypergeometric
distribution is

    $\sum_i i\,\frac{\binom{k}{i}\binom{N-k}{s-i}}{\binom{N}{s}} = \frac{sk}{N}$,

permitting the simplification

    $\Pr(A_{1,n+1} \mid E_{1,n}) = \sum_k \left[\frac{k}{N} + \frac{cs}{N}\left(1 - \frac{k}{N}\right)\right] \Pr(C_{k,n})$.

But by definition

    $\Pr(A_{1,n}) = \sum_k \frac{k}{N} \Pr(C_{k,n})$,

whence

    $\Pr(A_{1,n+1} \mid E_{1,n}) = \left(1 - \frac{cs}{N}\right)\Pr(A_{1,n}) + \frac{cs}{N}$.    (65a)

By the same method of proof we may show that

    $\Pr(A_{1,n+1} \mid E_{2,n}) = \left(1 - \frac{cs}{N}\right)\Pr(A_{1,n})$,    (65b)

    $\Pr(A_{1,n+1} \mid E_{0,n}) = \Pr(A_{1,n})$.    (65c)


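Finally, for comparison with other models, we present the expressions
for $\Pr(A_{i,n+1}E_{j,n}A_{k,n})$. As in previous cases (e.g.,
Eq. 31a), we give results only for the noncontingent situation in which
$\Pr(E_{0,n}) = 0$ and r = 2. Derivations of these probabilities are
based on the same methods used in connection with Eq. 65a. Writing
$\alpha_{1,n} = \Pr(A_{1,n})$,

    $\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\left[\alpha_{2,n} + \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66a)

    $\Pr(A_{1,n+1}E_{1,n}A_{2,n}) = \pi\left[\frac{cs}{N} + \left(1 - \frac{cs}{N}\right)\alpha_{1,n} - \alpha_{2,n} - \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66b)

    $\Pr(A_{1,n+1}E_{2,n}A_{1,n}) = (1 - \pi)\left[\alpha_{2,n} - \frac{cs}{N}\alpha_{1,n} + \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66c)

    $\Pr(A_{1,n+1}E_{2,n}A_{2,n}) = (1 - \pi)\left[1 - \frac{c(s-1)}{N-1}\right](\alpha_{1,n} - \alpha_{2,n})$    (66d)

    $\Pr(A_{2,n+1}E_{1,n}A_{1,n}) = \pi\left[1 - \frac{c(s-1)}{N-1}\right](\alpha_{1,n} - \alpha_{2,n})$    (66e)

    $\Pr(A_{2,n+1}E_{1,n}A_{2,n}) = \pi\left[\left(1 - \frac{cs}{N}\right)(1 - \alpha_{1,n}) - \alpha_{1,n} + \alpha_{2,n} + \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66f)

    $\Pr(A_{2,n+1}E_{2,n}A_{1,n}) = (1 - \pi)\left[\left(1 + \frac{cs}{N}\right)\alpha_{1,n} - \alpha_{2,n} - \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66g)

    $\Pr(A_{2,n+1}E_{2,n}A_{2,n}) = (1 - \pi)\left[1 - 2\alpha_{1,n} + \alpha_{2,n} + \frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})\right]$    (66h)
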
Application of these equations to the corresponding set of trigram
proportions for a pre-asymptotic trial block is not particularly
rewarding. The difficulty is that certain combinations of parameters,
e.g., $\frac{c(s-1)}{N-1}(\alpha_{1,n} - \alpha_{2,n})$ and
$\frac{cs}{N}$, behave as units; consequently, the basic parameters c,
s, and N cannot be estimated individually and, as a result, the
predictions available from the simpler N-element pattern model via
Eq. 32 cannot be improved upon by use of Eq. 66. For asymptotic data,
the situation is somewhat different. By substituting the limiting
values for $\alpha_{1,n}$ and $\alpha_{2,n}$ in Eq. 66, i.e.,
$\alpha_{1,\infty} = \pi$ and from Eq. 63

    $\alpha_{2,\infty} = \sigma_\infty^2 + \pi^2 = \frac{\pi(1 - \pi)[N + (N - 2)s]}{N(2N - s - 1)} + \pi^2 = \frac{\pi[N - 2s + Ns + 2\pi(N - s)(N - 1)]}{N(2N - s - 1)}$,

we can express the trigram probabilities
$\Pr(A_{i,\infty}E_{j,\infty}A_{k,\infty})$ in terms of the basic
parameters of the model. The resulting expressions are somewhat
cumbersome, however, and we shall not pursue this line of analysis
further in the present article.

5.2 Component Models with Stimulus Fluctuation

In the preceding section, as in most of the literature on stimulus
sampling models for learning, we restricted attention to the special case
in which the stimulation effective on successive trials of an experiment
may be considered to represent independent random samples from the
population of elements available under the given experimental conditions.
More generally, we would expect that the independence of successive
samples would depend on the interval between trials. The concept of
stimulus sampling in the model corresponds to the process of stimulation
in the empirical situation. Thus sampling and re-sampling from a stimulus
population must take time; and if the interval between trials is
sufficiently short, there will not be time to draw a completely new
sample. We should expect the correlation, or degree of overlap, between
successive stimulus samples to vary inversely with the intertrial
interval, running from perfect overlap in the limiting case (not
necessarily empirically realizable) of a zero interval to independence
at sufficiently long intervals. These notions have been embodied in the
stimulus fluctuation model (Estes, 1955a, 1955b, 1959a). In this section,
we shall develop the assumption of stimulus fluctuation in connection
with fixed sample size models; consequently, the expressions derived
will differ in minor respects from those of the earlier presentations
(cited above) which were not restricted to the case of fixed sample size.

Assumptions and Derivation of Retention Curves. Following the
convention of previous articles on stimulus fluctuation models, we shall
denote by S* the set of stimulus elements potentially available for
sampling under a given set of experimental conditions, by S the subset
of elements available for sampling at any given time and by S' the
subset of elements that are temporarily unavailable (so that
S* = S ∪ S'). The trial sample, s, is in turn a subset of S; however,
in this presentation we shall assume for simplicity that all of the
temporarily available elements are sampled on each trial (i.e., S = s).
We denote by N, N', and N*, respectively, the numbers of elements in s,
S', and S*.

The interchange between the stimulus sample and the remainder of
the population, i.e., between s and S', is assumed to occur at a
constant rate over time. Specifically, we assume that during an
interval $\Delta t$ which is just long enough to permit the interchange
of a single element between s and S', there is probability g that such
an interchange will occur, the parameter g being constant over time.
We shall limit consideration to the special case in which all stimulus
elements are equally likely to participate in an interchange. With this
restriction, the fluctuation process can be characterized by the
difference equation,

    $f(t + 1) = (1 - g)f(t) + g\left[f(t)\left(1 - \frac{1}{N}\right) + (1 - f(t))\frac{1}{N'}\right]$
    $\qquad = \left[1 - g\left(\frac{1}{N} + \frac{1}{N'}\right)\right]f(t) + \frac{g}{N'}$,    (67)

where f(t) denotes the probability that any given element of S* is
in s at time t. This recursion can be solved by standard methods
to yield the explicit formula

    $f(t) = J - [J - f(0)]\left[1 - g\left(\frac{1}{N} + \frac{1}{N'}\right)\right]^t$
    $\qquad = J - [J - f(0)]a^t$,    (68)

where $J = N/N^*$, the proportion of all elements which are in the
sample, and $a = 1 - g\left(\frac{1}{N} + \frac{1}{N'}\right)$.

Equation 68 can now serve as the basis for deriving numerous
expressions of experimental interest. Suppose for example, that at the
end of a conditioning (or extinction) period there were $j_0$
conditioned elements in S and $k_0$ conditioned elements in S', the
momentary probability of a conditioned response thus being
$p_0 = j_0/N$. To obtain an expression for probability of a conditioned
response after a rest interval of duration t, we proceed as follows.
For each conditioned element in S at the beginning of the interval, we
need only set f(0) = 1 in Equation 68 to obtain the probability that the
element is in S at time t. Similarly, for a conditioned element
initially in S', we set f(0) = 0 in Equation 68. Combining the two
types, we obtain for the expected number of conditioned elements in S at
time t,

    $j_0[J - (J - 1)a^t] + k_0J(1 - a^t) = (j_0 + k_0)J - [(j_0 + k_0)J - j_0]a^t$.

Dividing by N (and noting that $J = N/N^*$) we have, then, for the
probability of a conditioned response at time t,

    $p_t = \frac{j_0 + k_0}{N^*} - \left[\frac{j_0 + k_0}{N^*} - \frac{j_0}{N}\right]a^t$
    $\quad = p^* - (p^* - p_0)a^t$,    (69)

where $p^*$ and $p_0$ denote the proportion of conditioned elements in
the total population S* and the initial proportion in S, respectively.
If the rest interval begins following a conditioning period, we would
ordinarily have $p_0 > p^*$, in which case Equation 69 describes a
decreasing function (forgetting, or spontaneous regression). If the
rest interval begins following an extinction period, we would have
$p_0 < p^*$, in which case Equation 69 describes an increasing function
(spontaneous recovery). The manner in which cases of spontaneous
regression or recovery depend on the amount and spacing of previous
acquisition or extinction has been discussed in detail elsewhere (Estes,
1955a).
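
The fluctuation recursion and the resulting retention curve can be checked against each other numerically. The sketch below is illustrative: the function names and parameter values are assumptions, time is counted in whole interchange intervals, N and N' are the sizes of the available and unavailable subsets, and g is the interchange probability per interval.

```python
def f_iter(t, f0, N, N_prime, g):
    # Iterate the difference equation for the probability that a given
    # element is in the available set after t intervals.
    f = f0
    for _ in range(t):
        f = (1 - g) * f + g * (f * (1 - 1 / N) + (1 - f) / N_prime)
    return f

def f_closed(t, f0, N, N_prime, g):
    # Explicit solution: f(t) = J - (J - f(0)) a^t.
    J = N / (N + N_prime)
    a = 1 - g * (1 / N + 1 / N_prime)
    return J - (J - f0) * a ** t

def retention(t, j0, k0, N, N_prime, g):
    # Probability of a conditioned response after a rest of t intervals,
    # given j0 conditioned elements available and k0 unavailable at t = 0.
    p_star = (j0 + k0) / (N + N_prime)
    p0 = j0 / N
    a = 1 - g * (1 / N + 1 / N_prime)
    return p_star - (p_star - p0) * a ** t

if __name__ == "__main__":
    N, N_prime, g = 4, 12, 0.5  # illustrative values
    # After conditioning: all available elements conditioned, few unavailable.
    print([round(retention(t, 4, 2, N, N_prime, g), 4) for t in (0, 1, 5, 20)])
    # After extinction: no available elements conditioned, many unavailable.
    print([round(retention(t, 0, 9, N, N_prime, g), 4) for t in (0, 1, 5, 20)])
```

The first list falls from p_0 toward p* (spontaneous regression) and the second rises toward p* (spontaneous recovery).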

Application to the RTT Experiment. We noted in the preceding
section that the fixed sample size model could not provide a generally
satisfactory account of RTT experiments because it did not allow for
the retention loss usually observed between the first and second tests.
It seems reasonable that this defect might be remedied by removing the
restriction on independent sampling. To illustrate application of the
more general model with provision for stimulus fluctuation, we shall
again consider the case of an RTT experiment in which the probability
of a correct response is negligible prior to the reinforced trial (and
also on later trials if learning has not occurred). Letting $t_1$ and
$t_2$ denote the intervals between R and $T_1$ and between $T_1$ and
$T_2$, respectively, we may obtain the following basic expressions by
setting f(0) equal to 1 or 0, as appropriate, in Equation 68:

For the probability that an element sampled on R is sampled
again on $T_1$,

    $f_1 = J + (1 - J)a^{t_1}$;

for the probability that an element sampled on $T_1$ is sampled
again on $T_2$,

    $f_2 = J + (1 - J)a^{t_2}$;

and for the probability that an element not sampled on $T_1$ is
sampled on $T_2$,

    $f_3 = J(1 - a^{t_2})$.


Assuming now that N = 1, so that we are dealing with a generalized
form of the pattern model, we can write the probabilities of the four
combinations of correct and incorrect responses on $T_1$ and $T_2$ in
terms of the conditioning parameter c and the parameters $f_i$:

    $p_{00} = c f_1 f_2$,
    $p_{01} = c f_1 (1 - f_2)$,
    $p_{10} = c (1 - f_1) f_3$,
    $p_{11} = 1 - c + c(1 - f_1)(1 - f_3)$,    (70)

where, as before, the subscripts 0 and 1 denote correct responses
and errors, respectively. As they stand, Eq. 70 are not suitable for
application to data, for there are too many parameters to be estimated.
This difficulty could be surmounted by adding a third test trial, for
the resulting eight observation equations,

    $p_{000} = c f_1 f_2^2$,
    $p_{001} = c f_1 f_2 (1 - f_2)$,
    $p_{010} = c f_1 (1 - f_2) f_3$,

etc., would permit overdetermination of the four parameters. In the
case of some published studies (e.g., Estes, 1961b) the data can be
handled quite well on the assumption that $f_1$ is approximately unity,
in which case Eq. 70 reduce to

    $p_{00} = c f_2$,
    $p_{01} = c(1 - f_2)$,
    $p_{10} = 0$,
    $p_{11} = 1 - c$.


In the general case of Eq. 70, some predictions can be
made without information as to exact parameter values. It has been
noted in published studies (Estes, Hopkins and Crothers, 1960; Estes,
1961b) that the observed proportion $p_{01}$ is generally larger than
$p_{10}$. Taking the difference between the theoretical expressions for
these quantities, we have

    $p_{01} - p_{10} = c f_1(1 - f_2) - c(1 - f_1) f_3$
    $\qquad = c[J + (1 - J)a^{t_1}](1 - J)(1 - a^{t_2}) - c(1 - J)(1 - a^{t_1})J(1 - a^{t_2})$
    $\qquad = c(1 - J)(1 - a^{t_2})[J + (1 - J)a^{t_1} - J(1 - a^{t_1})]$
    $\qquad = c(1 - J)(1 - a^{t_2})a^{t_1}$,

which obviously must be equal to or greater than zero. The experiments
cited above have in all cases had $t_1 < t_2$, and therefore
$f_1 > f_2$. Since $f_2$, which is directly estimated by the proportions
of instances in which correct responses on $T_1$ are repeated on $T_2$,
has ranged from about .6 to .9 in these experiments (and $f_1$ must be
larger), it is clear that $p_{10}$, the probability of an incorrect
followed by a correct response, should be relatively small. This
theoretical prediction accords well with observation.
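
The RTT predictions with stimulus fluctuation can be evaluated for any choice of intervals. The sketch below is a hedged illustration: the function name and parameter values are assumptions, J is the proportion of the population in the sample (1/N* when N = 1), a is the fluctuation constant, and the resampling probabilities are taken as f1 = J + (1 - J) a^t1, f2 = J + (1 - J) a^t2, and f3 = J (1 - a^t2).

```python
def rtt_fluctuation(c, J, a, t1, t2):
    # Joint test-trial probabilities (0 = correct, 1 = error) for one
    # reinforcement followed by tests after intervals t1 and t2.
    f1 = J + (1 - J) * a ** t1  # sampled on R and again on T1
    f2 = J + (1 - J) * a ** t2  # sampled on T1 and again on T2
    f3 = J * (1 - a ** t2)      # not sampled on T1 but sampled on T2
    return {
        "p00": c * f1 * f2,
        "p01": c * f1 * (1 - f2),
        "p10": c * (1 - f1) * f3,
        "p11": 1 - c + c * (1 - f1) * (1 - f3),
    }

if __name__ == "__main__":
    c, J, a = 0.6, 0.25, 0.9  # illustrative values
    p = rtt_fluctuation(c, J, a, t1=2, t2=5)
    print({k: round(v, 4) for k, v in p.items()})
    # The asymmetry p01 - p10 should equal c (1 - J) (1 - a^t2) a^t1.
    print(round(p["p01"] - p["p10"], 6),
          round(c * (1 - J) * (1 - a ** 5) * a ** 2, 6))
```

Lengthening t1 in this sketch shrinks the asymmetry between p01 and p10, while lengthening t2 lowers the chance that a correct response is repeated.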
Numerous predictions can be generated concerning the effects of
varying the durations of $t_1$ and $t_2$. The probability of repeating
a correct response from $T_1$ to $T_2$, for example, should depend
solely on the parameter $t_2$, decreasing as $t_2$ increases (and $f_2$
therefore decreases). The probability of a correct response on $T_2$
following an incorrect response on $T_1$ should depend most strongly on
$t_2$, increasing as $t_2$ (and therefore $f_3$) increases. The overall
proportion correct per test should, of course, decrease from $T_1$ to
$T_2$ (although the difference between proportions on $T_1$ and $T_2$
tends to zero as $t_1$ becomes large). Data relevant to these and other
predictions are
available in studies by Estes, Hopkins, and Crothers (1960), Peterson, 
Saltzman, Hillner, and Land (1962), and Witte (R. Witte, personal com- 
munication). The predictions concerning effects of variation of ty 
are well confirmed by these studies. Results bearing on predictions 
concerning variation in ty are not consistent over the set of experi- 


ments, possibly because of artifacts arising from item selection 


(discussed by Peterson, et al, 1962). 
Application to the Simple Noncontingent Case. We shall restrict
consideration to the special case of N = 1; thus, we shall be dealing
with a variant of the pattern model in which the pattern sampled on any
trial is the one most likely to be sampled on the next trial. No new
concepts are required beyond those introduced in connection with the
RTT experiment, but it will be convenient to denote by a single symbol,
say g, the probability that the stimulus pattern sampled on any trial
n is exchanged for another pattern on trial n + 1. In terms of the
notation used above,

    $g = 1 - f_2 = (1 - J)(1 - a^t) = \left(1 - \frac{1}{N^*}\right)(1 - a^t)$,

where t is now taken to denote the intertrial interval. Also, we
denote by $u_{1m,n}$ the probability of the state of the organism in
which m stimulus patterns are conditioned to the $A_1$ response and one
of these is sampled, and by $u_{0m,n}$ the probability that m patterns
are conditioned to $A_1$ but a pattern conditioned to $A_2$ is sampled.
Obviously,

    $p_n = \sum_m u_{1m,n}$,

where, as usual, $p_n$ denotes probability of the $A_1$ response on
trial n.

Now we can write expressions for trigram probabilities, following
essentially the same reasoning used before in the case of the pattern
model with independent sampling. For the joint event
$A_{1,n+1}E_{1,n}A_{1,n}$, we obtain

    $\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi \sum_m u_{1m,n}\left[1 - g + g\,\frac{m - 1}{N^* - 1}\right]$
    $\qquad = \pi\left[\left(1 - g - \frac{g}{N^* - 1}\right)p_n + g \sum_m u_{1m,n}\frac{m}{N^* - 1}\right]$;

for if an element conditioned to $A_1$ is sampled on trial n, then
with probability 1 - g it is resampled and with probability
$g\,\frac{m - 1}{N^* - 1}$ it is replaced by another element conditioned
to $A_1$, and in either event an $A_1$ response must occur on trial
n + 1. Using the abbreviations $U_n = \sum_m u_{1m,n}\frac{m}{N^* - 1}$
and $V_n = \sum_m u_{0m,n}\frac{m}{N^* - 1}$, the trigram probabilities
can be written in relatively compact form:

    $\Pr(A_{1,n+1}E_{1,n}A_{1,n}) = \pi\left[\left(1 - g - \frac{g}{N^* - 1}\right)p_n + gU_n\right]$,

    $\Pr(A_{1,n+1}E_{2,n}A_{1,n}) = (1 - \pi)\left\{\left[(1 - c)(1 - g) - \frac{g}{N^* - 1}\right]p_n + gU_n\right\}$,

    $\Pr(A_{1,n+1}E_{1,n}A_{2,n}) = \pi[c(1 - g)(1 - p_n) + gV_n]$,

    $\Pr(A_{1,n+1}E_{2,n}A_{2,n}) = (1 - \pi)gV_n$,

    $\Pr(A_{2,n+1}E_{1,n}A_{1,n}) = \pi g\left[\left(1 + \frac{1}{N^* - 1}\right)p_n - U_n\right]$,

    $\Pr(A_{2,n+1}E_{2,n}A_{1,n}) = (1 - \pi)\left[\left(c - cg + g + \frac{g}{N^* - 1}\right)p_n - gU_n\right]$,

    $\Pr(A_{2,n+1}E_{1,n}A_{2,n}) = \pi[(1 - c + cg)(1 - p_n) - gV_n]$,

    $\Pr(A_{2,n+1}E_{2,n}A_{2,n}) = (1 - \pi)[1 - p_n - gV_n]$.    (71)

The chief difference between these expressions and the corresponding
ones for the independent sampling models is that sequential effects
now depend on the intertrial interval. Consider, for example, the
first two of Equations 71, involving repetitions of response $A_1$. It
will be noted that both of these expressions represent linear
combinations of $p_n$ and $U_n$, with the relative contribution of
$p_n$ increasing as the intertrial interval (and therefore g)
decreases. Also, it is apparent from the defining equations for $p_n$
and $U_n$, that $p_n \geq U_n$, with equality obtaining only in the
special cases where both are equal to unity or both equal to zero.
Therefore, the probability of a repetition is inversely related to the
intertrial interval. In particular, the probability that a correct
$A_1$ or $A_2$ response will be repeated tends to unity in the limit as
the intertrial interval goes to zero. When the intertrial interval
becomes large, the parameter g approaches $1 - \frac{1}{N^*}$, and
Equations 71 reduce to those of a pattern model with N* elements and
independent sampling.

Summing the first four of Equations 71, we obtain a recursion for 


probability of the AL “response: 


p 1 = (L-c-g-$,+ce)p, + c(1-g)x + g(U, +¥,) 3 


nt. 


Now, although a full proof would be quite involved, it is not hard 
to show heuristically that the asymptote is independent of the inter- 


trial interval. We note first that asymptotically we will have 
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- m ent 
Uy = 2 am i = 2 Uy Ie 


2 

-_ mn _ 

ww 2 ] Um yt % on ; 
where u is the probability that m elements are conditioned: to AY . 
The substitution of u es for u is possible in view of the intui- 

m N* im 
tively evident fact that, asymptotically, the probability that an ele~ 
ment conditioned to AL constitutes the trial sample is simply equal 
to the proportion of such elements in-the total population. ‘“Substi- eh 


tuting into the recursion for Py in terms of this’ relation, and the 


analogous one for Va ’ 





we obtain 





Sifats Saale 7 Ne 
Puay = (be cress, teelp, + c(l-ge)a + 6 FB, 


(l-e+eg)p + c(l-g)n 5 





the simplification in the last line having been effected by means of 


the identity 





N+ 1. Nx 
- e-is He elgrnee Bae 
Setting $p_{n+1} = p_n = p_\infty$ and solving for $p_\infty$, we
arrive at the tidy outcome

$$p_\infty = (1 - c + cg)\,p_\infty + c(1-g)\pi ,$$

whence

$$p_\infty = \pi .$$
The recursion in $p_n$ can be solved, but the resulting formula
expressing $p_n$ as a function of $n$ and the parameters is too
cumbersome to yield much useful information by visual inspection. It
seems intuitively obvious that for $g < 1 - \frac{1}{N}$ (i.e., for any
but very long intertrial intervals) the learning curve will rise more
sharply on early trials than the corresponding curve for the
independent sampling case. This is so because only sampled elements can
undergo conditioning, and once sampled, an element is more likely to be
resampled the shorter the intertrial interval. However, the curves for
longer and shorter intervals must cross ultimately, with the curve for
the longer interval approaching asymptote more rapidly on later trials
(Estes, 1955b). If $\pi = 1$, the total number of errors expected
during learning must be independent of the intertrial interval; for
each initially unconditioned element will continue to produce an error
each time it is sampled until it is finally conditioned, and the
probability of any specified number of errors prior to conditioning
depends only on the value of the conditioning parameter $c$. Similarly,
if $\pi$ is set equal to 0 after a conditioning session, the total
number of conditioned responses during extinction is independent of the
intertrial interval.
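
As a rough numerical check on the independence of the asymptote from
the intertrial interval, the asymptotic form of the recursion,
$p_{n+1} = (1 - c + cg)p_n + c(1-g)\pi$, may be iterated as though it
held on every trial. The sketch below is illustrative only: the
function name and the parameter values are our own assumptions, and the
recursion is strictly justified only near asymptote.

```python
def iterate_asymptotic_recursion(c, g, pi, p0=0.0, trials=5000):
    """Iterate p <- (1 - c + c*g)*p + c*(1 - g)*pi and return the last value.

    c  : conditioning parameter (assumed value supplied by the caller)
    g  : fluctuation parameter tied to the intertrial interval
    pi : probability of the reinforcing event E_1
    """
    p = p0
    for _ in range(trials):
        p = (1 - c + c * g) * p + c * (1 - g) * pi
    return p


if __name__ == "__main__":
    # Different values of g change the rate of approach but not the limit.
    for g in (0.0, 0.3, 0.6):
        print(g, iterate_asymptotic_recursion(c=0.2, g=g, pi=0.7))
```

For any $g < 1$ the multiplier $1 - c(1-g)$ is below unity, so the
iteration settles on $\pi$ whatever the starting value.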


5.3 The Linear Model as a Limiting Case 

For those experiments in which the available stimuli are the same
on all trials, the possibility arises of using a model that suppresses
the concept of stimuli. In such a "pure" reinforcement model the
learning assumptions specify directly how response probability changes
on a reinforced trial. By all odds the most popular models of this
sort are those which assume probability of a response on a given trial
to be a linear function of the probability of that response on the
previous trial.*

   * For a discussion of this general class of "incremental" models see
     the chapter by Sternberg in this volume.

The so-called "linear models" received their first systematic treatment
by Bush and Mosteller (1951a, 1955) and have been investigated and
developed further by many others. We shall be concerned only with a
certain class of such models based on a single learning parameter
$\theta$. A more extensive analysis of this class of linear models has
been given by Estes and Suppes (1959a).

The linear theory is formulated for the probability of a response
on trial $n+1$, given the entire preceding sequence of responses and
reinforcements.* Let $x_n$ be the sequence of responses and
reinforcements of a given subject through trial $n$; that is, $x_n$ is
a sequence of length $2n$ with $j$'s (where $j = 1$ to $r$) in the odd
positions indicating responses and $i$'s (where $i = 0$ to $r$) in the
even positions indicating reinforcements. The axioms of the linear
model are as follows: for every $i$, $i'$ and $k$ such that
$1 \le i, i' \le r$ and $0 \le k \le r$,

   * In the language of stochastic processes we have a chain of infinite
     order.


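L1. If $\Pr(E_{i,n} A_{i',n} x_{n-1}) > 0$ then

$$\Pr(A_{i,n+1} \mid E_{i,n} A_{i',n} x_{n-1}) = (1-\theta)\Pr(A_{i,n} \mid x_{n-1}) + \theta ;$$

L2. If $\Pr(E_{k,n} A_{i',n} x_{n-1}) > 0$, $k \ne i$ and $k \ne 0$, then

$$\Pr(A_{i,n+1} \mid E_{k,n} A_{i',n} x_{n-1}) = (1-\theta)\Pr(A_{i,n} \mid x_{n-1}) ;$$

L3. If $\Pr(E_{0,n} A_{i',n} x_{n-1}) > 0$ then

$$\Pr(A_{i,n+1} \mid E_{0,n} A_{i',n} x_{n-1}) = \Pr(A_{i,n} \mid x_{n-1}) .$$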
By Axiom L1, if the reinforcing event $E_i$ corresponding to
response $A_i$ occurs on trial $n$, then (regardless of the response
occurring on trial $n$) the probability of $A_i$ increases by a linear
transform of the old value. By L2, if some reinforcing event other
than $E_i$ occurs on trial $n$, then the probability of $A_i$ decreases
by a linear transform of its old value. And by L3, occurrence of the
("neutral") event $E_0$ leaves response probabilities unchanged. The
axioms may be written more compactly in terms of the probability
$p_{xi,n}$ that a subject identified with sequence $x$ makes an $A_i$
response on trial $n$; namely,

1. If the subject receives an $E_i$ event on trial $n$,

$$p_{xi,n+1} = (1-\theta)\,p_{xi,n} + \theta ;$$

2. if the subject receives an $E_k$ event ($k \ne i$ and $k \ne 0$)
on trial $n$,

$$p_{xi,n+1} = (1-\theta)\,p_{xi,n} ;$$

3. if the subject receives an $E_0$ event on trial $n$,

$$p_{xi,n+1} = p_{xi,n} .$$
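
The three rules just stated amount to a single update function. The
following is a minimal sketch under our own naming assumptions
(`linear_update`, with the reinforcing event coded by its index and 0
standing for the neutral event); it is meant only to make the operators
concrete.

```python
def linear_update(p, event, i, theta):
    """Return the probability of response A_i after one trial.

    p     : probability of A_i on the current trial
    event : index k of the reinforcing event E_k (0 is the neutral event)
    i     : index of the response whose probability is being tracked
    theta : learning parameter, between 0 and 1
    """
    if event == 0:          # rule 3: neutral event, no change
        return p
    if event == i:          # rule 1: A_i reinforced, probability moves toward 1
        return (1 - theta) * p + theta
    return (1 - theta) * p  # rule 2: another response reinforced


if __name__ == "__main__":
    p = 0.5
    for event in (1, 1, 2, 0, 1):   # a hypothetical reinforcement sequence
        p = linear_update(p, event, i=1, theta=0.2)
        print(event, p)
```

Because each rule is a linear map of the unit interval into itself, the
probability can take any value between 0 and 1 after enough trials.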
From a mathematical standpoint it is important to note that for
the linear model the response probability associated with a particular
subject is free to vary continuously over the entire interval from 0
to 1, since this probability undergoes linear transformations as a
result of reinforcement. Consequently, if one wishes to interpret
changes in response probability as transitions among states of a Markov
process, one must deal with a continuous-state space. Thus the Markov
interpretation is of little practical value for calculational purposes.
In stimulus sampling models, response probability is defined in terms
of the proportion of stimuli conditioned; since the set of stimuli is
finite, so also is the set of values taken on by the response
probability of any individual subject. It is this finite character of
stimulus sampling models that makes possible the extremely useful
interpretation of the models as finite Markov chains.
An inspection of the three axioms for the linear model indicates
that they have the same general form as Equations 60, which describe
changes in response probability for the fixed sample size component
model; that is, if we let $\theta = \frac{s}{N}$ then the two sets of
rules are similar. As one might expect from this observation, many of
the predictions generated by the two models are identical when
$\theta = \frac{s}{N}$. For example, in the simple noncontingent
situation the mean learning curve for the linear model is

$$\Pr(A_{1,n}) = \pi - [\pi - \Pr(A_{1,1})](1-\theta)^{n-1} , \tag{72}$$

which is the same as that of the component model (see Estes and Suppes,
1959a, for a derivation of results for the linear model). However,
the two models are not identical in all respects, as is indicated by a
comparison of the asymptotic variances of the response distributions.
For the linear model

$$\sigma_\infty^2 = \pi(1-\pi)\,\frac{\theta}{2-\theta} ,$$

as contrasted to Equation 63 for the component model. However, as
noted above in connection with Equation 63, in the limit (as
$N \to \infty$) the $\sigma_\infty^2$ for the component model equals
the predicted value for the linear model.
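
The mean learning curve of Equation 72 can be checked against direct
iteration of the expected one-trial change. In the noncontingent
situation $E_1$ occurs with probability $\pi$, so the mean obeys
$p_{n+1} = (1-\theta)p_n + \theta\pi$. The sketch below assumes this
mean recursion; the function names and parameter values are ours and
serve only as an illustration.

```python
def mean_curve_closed(n, p1, pi, theta):
    """Equation 72: mean probability of A_1 on trial n (n = 1, 2, ...)."""
    return pi - (pi - p1) * (1 - theta) ** (n - 1)


def mean_curve_iterated(n, p1, pi, theta):
    """Same quantity obtained by applying the mean recursion n - 1 times."""
    p = p1
    for _ in range(n - 1):
        # E_1 with probability pi, E_2 otherwise, averaged over events
        p = (1 - theta) * p + theta * pi
    return p


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        print(n,
              mean_curve_closed(n, 0.5, 0.8, 0.1),
              mean_curve_iterated(n, 0.5, 0.8, 0.1))
```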

The last result suggests that the component model may converge to
the linear process as $N \to \infty$. This conjecture is substantially
correct; it can be shown that in the limit both the fixed sample size
model and the independent sampling model approach the linear model for
an extremely broad class of assumptions governing the sampling of
elements. The derivation of the linear model from component models
holds for any reinforcement schedule, for any finite number $r$ of
responses, and for every trial $n$, not simply at asymptote. The proof
of this convergence theorem is lengthy and we shall not present it
here. However, as one might expect, the proof depends on the fact that
the variance of the sampling distribution for any statistic of the
trial sample approaches 0 as $N$ becomes large. A proof of the
convergence theorem is given by Estes and Suppes (1959b). Kemeny and
Snell (1957) also have considered the problem but their proof is
restricted to the two-choice noncontingent situation at asymptote.

Comparison of the Linear and Pattern Models. The same limiting
result, of course, does not hold for the pattern model discussed in
Sec. 3. For the pattern model only one element is sampled on each trial
and it is obvious that as $N \to \infty$ the learning effect of this
sampling scheme would diminish to zero. For experimental situations
where both the linear model and the pattern model appear to be
applicable it is important to derive differential predictions from the
two models which, on empirical grounds, will permit the researcher to
choose between them. To this end we display a few predictions for the
linear model applied to both the RTT situation and the simple
two-response noncontingent situation; these results will be compared
with the corresponding equations for the pattern model.

For simplicity, let us assume that in the case of the RTT situation
the likelihood of a correct response by guessing is negligible on all
trials. Then, according to the linear model, probability of a
reinforced response changes in accordance with the equation

$$p_{n+1} = (1-\theta)\,p_n + \theta .$$

In the present application the probability of a correct response
on the first trial (the R trial) is zero, and hence the probability
of a correct response on the first test trial is simply $\theta$. No
reinforcement is given on $T_1$ and consequently the probability of a
correct response does not change between $T_1$ and $T_2$. Therefore,
$p_{00}$, the probability of a correct response on both $T_1$ and $T_2$
(as defined in connection with Equation 55), is $\theta^2$. Similarly,
we obtain $p_{01} = p_{10} = \theta(1-\theta)$, and
$p_{11} = (1-\theta)^2$. Some relevant data are presented in Table 6
(from Estes, 1961b). They represent joint response proportions for 40
subjects, each tested on 15 paired-associate items of the type
described in Sec. 2.1, the RTT design applied to each item. In order to
minimize the probability of correct responses occurring by guessing,
these items were introduced (one per trial) into a larger list, the
composition of which changed from trial to trial. A critical item
introduced on trial $n$ received one reinforcement (paired presentation
of stimulus and response members) followed by a test (presentation of
stimulus alone) on trial $n$ and trial $n+1$, following which it was
dropped from the list.

From an inspection of the data column of Table 6 it is obvious
that the simple linear model cannot handle these proportions. It
suffices to note that the model requires $p_{01} = p_{10}$, whereas the
difference between these two entries in the data column is quite large.

Table 6

Observed Joint Response Proportions for RTT Experiment and Predictions
from Linear Retention-Loss Model and Sampling Model.

            Observed        Linear        Sampling
            Proportion      Model         Model

One might try to preserve the linear model by arguing that the
pattern of observed results in Table 6 could have arisen as an artifact.
If, for example, there are differences in difficulty among items (or,
equivalently, differences in learning rate among subjects), then the
instances of incorrect response on $T_1$ would predominantly represent
smaller $\theta$ values than instances of correct responses. On this
account one might expect that the predicted proportion of correct
following incorrect responses would be smaller than that allowed for
under the "equal $\theta$" assumption, and therefore that the linear
model might not actually be incompatible with the data of Table 6. We
can easily check the validity of such an argument. Suppose that
parameter $\theta_i$ is associated with a proportion $f_i$ of the items
(or subjects). Then in each case where $\theta_i$ is applicable, the
probability of a correct response on $T_1$ followed by an error on
$T_2$ is $\theta_i(1-\theta_i)$. Clearly then, $p_{01}$ estimated from
a group of items described by differences in $\theta$ would be

$$p_{01} = \sum_i f_i\,\theta_i(1-\theta_i) .$$

But a similar argument yields

$$p_{10} = \sum_i f_i\,(1-\theta_i)\theta_i .$$

Since, again, the expressions for $p_{10}$ and $p_{01}$ are equal for
all distributions of $\theta_i$, it is clear that individual
differences in learning rates alone could not account for the observed
results.
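
The argument can be illustrated numerically. The sketch below computes
the four joint proportions for an arbitrary mixture of learning rates;
the function name and the particular $\theta_i$ and $f_i$ values are
hypothetical, chosen only to show that $p_{01}$ and $p_{10}$ coincide
however the mixture is composed.

```python
def rtt_mixture(thetas, weights):
    """Joint T1/T2 proportions under the linear model with mixed thetas.

    thetas  : learning parameters theta_i, one per subgroup
    weights : proportions f_i of items (or subjects); should sum to 1
    Index 0 denotes a correct response and 1 an error, as in the text.
    """
    pairs = list(zip(thetas, weights))
    p00 = sum(f * t * t for t, f in pairs)
    p01 = sum(f * t * (1 - t) for t, f in pairs)
    p10 = sum(f * (1 - t) * t for t, f in pairs)
    p11 = sum(f * (1 - t) * (1 - t) for t, f in pairs)
    return p00, p01, p10, p11


if __name__ == "__main__":
    # A deliberately extreme, made-up mixture of fast and slow learners.
    print(rtt_mixture([0.9, 0.1, 0.5], [0.3, 0.5, 0.2]))
```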

A related hypothesis that might seem to merit consideration is
that of individual differences in rates of forgetting. Since the
proportion of correct responses on $T_2$ is less than that on $T_1$,
there is evidently some retention loss, and differences among subjects,
or items, in susceptibility to this retention loss might be a source of
bias in the data. The hypothesis can be formulated in the linear model
as follows: Probability of the correct response on $T_1$ is equal to
$\theta$; if, however, there is a retention loss then the probability
of a correct response on $T_2$ will have declined to some value
$\rho$, such that $\rho < \theta$. If there are individual differences
in amount of retention loss, then we should again categorize the
population of subjects and items into subgroups, with a proportion
$f_i$ of the subjects characterized by retention parameter $\rho_i$.
Theoretical expressions for $p_{ij}$ can be derived for such a
population by the same method used in the preceding case; the results
are as follows:

$$p_{00} = \theta \sum_i f_i\,\rho_i ,$$

$$p_{01} = \theta \sum_i f_i\,(1-\rho_i) ,$$

$$p_{10} = (1-\theta) \sum_i f_i\,\rho_i ,$$

$$p_{11} = (1-\theta) \sum_i f_i\,(1-\rho_i) .$$

This time the expressions for $p_{10}$ and $p_{01}$ are different; with
a suitable choice of parameter values, they could accommodate the
difference between the observed proportions $p_{01}$ and $p_{10}$.
However, another difficulty remains. To obtain a near zero value for
$p_{10}$ would require either a $\theta$ near unity, which would be
incompatible with the observed proportion of .385 correct on $T_1$, or
a value of $\sum_i f_i\rho_i$ near zero, which would be incompatible
with the observed proportion of .255 correct on $T_2$. Thus, we have no
support for the hypothesis that individual differences in amount of
retention loss might account for the pattern of empirical values.

One can go on in a similar fashion and examine the results of
supplementing the original linear model by hypotheses involving more
complex combinations or interactions of possible sources of bias (see
Estes, 1961b). For example, one might assume that there are large
individual differences in both learning and retention parameters. But
even with this latitude it is not easy to adjust the linear model to
the RTT data. Suppose that we admit different learning parameters,
$\theta_1$ and $\theta_2$, and different retention parameters,
$\rho_1$ and $\rho_2$, the combination $\theta_1, \rho_1$ obtaining for
half the items and the combination $\theta_2, \rho_2$ for the other
half. Now the $p_{ij}$ formulas become

$$p_{00} = \frac{\theta_1\rho_1 + \theta_2\rho_2}{2} ,$$

$$p_{01} = \frac{\theta_1(1-\rho_1) + \theta_2(1-\rho_2)}{2} ,$$

$$p_{10} = \frac{(1-\theta_1)\rho_1 + (1-\theta_2)\rho_2}{2} ,$$

$$p_{11} = \frac{(1-\theta_1)(1-\rho_1) + (1-\theta_2)(1-\rho_2)}{2} .$$

From the data column of Table 6, the proportions of correct responses
on the first and second test trials are $p_{0\cdot} = .385$ and
$p_{\cdot 0} = .255$, respectively. Adding the first and second of the
equations above to obtain the theoretical expression for $p_{0\cdot}$,
and the first and third equations to get $p_{\cdot 0}$, we have

$$p_{0\cdot} = \frac{\theta_1 + \theta_2}{2}$$

and

$$p_{\cdot 0} = \frac{\rho_1 + \rho_2}{2} .$$

Equating theoretical and observed values, we obtain the constraints

$$\theta_1 + \theta_2 = .770 ,$$

$$\rho_1 + \rho_2 = .510 ,$$

which should be satisfied by the parameter values. If the proportion
$p_{00}$ in Table 6 is to be predicted correctly, we must have further

$$\frac{\theta_1\rho_1 + \theta_2\rho_2}{2} = .238 ,$$

or, substituting from the two equations just above,

$$\theta_1\rho_1 + (.77 - \theta_1)(.51 - \rho_1) = .476 ,$$

which may be solved for $\theta_1$:

$$\theta_1 = \frac{.083 + .77\rho_1}{2\rho_1 - .51} .$$

Now the admissible range of parameter values can be further reduced.
For the right hand side of this last equation to have a value between
0 and 1, $\rho_1$ must be greater than .48, so we have the relatively
narrow bounds on the parameters $\rho_i$:

$$.48 < \rho_1 < .51 ,$$

$$\rho_2 < .03 .$$

Using these bounds on $\rho_1$, we find from the equation expressing
$\theta_1$ as a function of $\rho_1$ that $\theta_1$ must in turn
satisfy $.93 < \theta_1 < 1.0$. But now the model is in trouble, for in
order to also satisfy the constraint $\theta_1 + \theta_2 = .77$,
$\theta_2$ would have to be negative (and the correct response
probabilities for half of the items on $T_1$ would also be negative).
About the best we can do, without allowing "negative probabilities," is
to use the limits we have obtained for $\rho_1$, $\rho_2$, and
$\theta_1$ and arbitrarily assign a zero or small positive value to
$\theta_2$. Choosing the combination $\theta_1 = .95$,
$\theta_2 = .01$, $\rho_1 = .5$, and $\rho_2 = .01$, we obtain the
theoretical values listed for the linear model in Table 6. By
introducing additional assumptions or additional parameters, we could
improve the fit of the linear model to these data, but there would seem
to be little point in doing so. The refractoriness of the data to
description by any reasonably simple form of the model suggests that
perhaps the learning process is simply not well represented by the type
of growth function embodied in the linear model.
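
The bounds just derived, and the joint proportions implied by the
chosen parameter combination, can be reproduced with a few lines of
arithmetic. The sketch below assumes the two-subgroup formulas for
$p_{ij}$ given above, with each subgroup comprising half the items; the
function names are ours.

```python
def theta1_from_rho1(rho1):
    """theta_1 implied by the three constraints, as a function of rho_1."""
    return (0.083 + 0.77 * rho1) / (2 * rho1 - 0.51)


def two_group_predictions(theta1, theta2, rho1, rho2):
    """Joint T1/T2 proportions when each parameter pair governs half the items."""
    p00 = (theta1 * rho1 + theta2 * rho2) / 2
    p01 = (theta1 * (1 - rho1) + theta2 * (1 - rho2)) / 2
    p10 = ((1 - theta1) * rho1 + (1 - theta2) * rho2) / 2
    p11 = ((1 - theta1) * (1 - rho1) + (1 - theta2) * (1 - rho2)) / 2
    return p00, p01, p10, p11


if __name__ == "__main__":
    # theta_1 exceeds 1 at rho_1 = .48 and falls to about .93 at rho_1 = .51
    for rho1 in (0.48, 0.49, 0.50, 0.51):
        print(rho1, theta1_from_rho1(rho1))
    # The compromise combination adopted in the text
    print(two_group_predictions(0.95, 0.01, 0.5, 0.01))
```

With the compromise values the predicted $p_{00}$ and $p_{10}$ stay
close to the observed proportions, but $p_{01}$ is forced well above
its observed value, which is the misfit the text describes.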

By contrast, these data can be quite readily handled by the stimulus
fluctuation model developed in the preceding section. Letting
$f_1 = 1$ in Equations 70, and using the estimates $c = .39$ and
$f_2 = .61$, we obtain the theoretical values listed under "Sampling
Model" in Table 6. One would not, of course, claim that the sampling
model has been rigorously tested, since two parameters had to be
estimated and there are only three degrees of freedom in this set of
data. However, the model does seem more promising than any of the
variants of the linear model that have been investigated. More
stringent tests of the sampling model can readily be obtained by
running similar experiments with longer sequences of test trials, since
predictions concerning joint response proportions over blocks of three
or more test trials can be generated without additional assumptions.

Additional Comparisons Between the Linear and Pattern Model. We
now turn to a few comparisons between the linear model and the
multi-element pattern model for the simple noncontingent situation.
First of all, we note that the mean learning curves for the two models
(as given in Equation 37 and Equation 72) are identical if we let
$\frac{c}{N} = \theta$. However, the expressions for the variance of
the asymptotic response distribution are different; for the linear
model $\sigma_\infty^2 = \pi(1-\pi)\frac{\theta}{2-\theta}$, whereas
for the pattern model $\sigma_\infty^2 = \pi(1-\pi)\frac{1}{N}$. This
difference is reflected in another prediction that provides a more
direct experimental test of the two models. This concerns the
asymptotic variance of the distribution of the number of $A_1$
responses in a block of $K$ trials, which we denote
$\mathrm{Var}(A_K)$. For the linear model (cf. Estes and Suppes,
1959a),

$$\mathrm{Var}(A_K) = \pi(1-\pi)\left\{ \frac{K(4-3\theta)}{2-\theta} - \frac{2(1-\theta)}{\theta(2-\theta)}\left[1 - (1-\theta)^K\right] \right\} ;$$

for the pattern model, by Eq. 42,

$$\mathrm{Var}(A_K) = \pi(1-\pi)\left\{ K + \frac{2K(1-c)}{c} - \frac{2N(1-c)}{c^2}\left[1 - \left(1-\frac{c}{N}\right)^K\right] \right\} .$$

Note that, for $c = \theta$, the variance for the pattern model is
larger than for the linear model. However, for the case of
$\theta = \frac{c}{N}$, the variance for the pattern model can be
larger or smaller than for the linear model depending on the particular
values of $c$ and $N$.
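
For the linear model the block variance can be checked by summing
covariances directly. The sketch below rests on two assumptions of ours
that are not spelled out in the text: that at asymptote each response
has variance $\pi(1-\pi)$, and that responses $k$ trials apart have
covariance $(1-\theta)^k\,\pi(1-\pi)\theta/(2-\theta)$, which follows
from the asymptotic variance of response probability quoted above. The
closed form used is the expression
$\pi(1-\pi)\{K(4-3\theta)/(2-\theta) - 2(1-\theta)[1-(1-\theta)^K]/[\theta(2-\theta)]\}$;
the function names are hypothetical.

```python
def block_variance_by_sum(K, pi, theta):
    """Variance of the number of A_1 responses in K trials, summed term by term."""
    var_p = pi * (1 - pi) * theta / (2 - theta)   # asymptotic variance of p
    total = K * pi * (1 - pi)                     # K single-trial variances
    for lag in range(1, K):
        # (K - lag) pairs of trials lie `lag` trials apart; each pair counts twice
        total += 2 * (K - lag) * (1 - theta) ** lag * var_p
    return total


def block_variance_closed(K, pi, theta):
    """Closed-form counterpart of the sum above."""
    first = K * (4 - 3 * theta) / (2 - theta)
    second = 2 * (1 - theta) * (1 - (1 - theta) ** K) / (theta * (2 - theta))
    return pi * (1 - pi) * (first - second)


if __name__ == "__main__":
    for K in (1, 2, 10, 50):
        print(K,
              block_variance_by_sum(K, 0.7, 0.2),
              block_variance_closed(K, 0.7, 0.2))
```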
Finally, we present certain asymptotic sequential predictions for
the linear model in the noncontingent situation; namely

$$\lim \Pr(A_{1,n+1} \mid E_{1,n} A_{1,n}) = (1-\theta)a + \theta ,$$

$$\lim \Pr(A_{1,n+1} \mid E_{2,n} A_{1,n}) = (1-\theta)a ,$$

$$\lim \Pr(A_{1,n+1} \mid E_{1,n} A_{2,n}) = 1 - (1-\theta)b ,$$

$$\lim \Pr(A_{1,n+1} \mid E_{2,n} A_{2,n}) = (1-\theta)(1-b) ,$$

where $a = [2\pi(1-\theta) + \theta]/(2-\theta)$ and
$b = [2(1-\pi)(1-\theta) + \theta]/(2-\theta)$. These predictions are
to be compared with Eq. 34 for the pattern model. In the case of the
pattern model we note that $\Pr(A_1 \mid E_1 A_1)$ and
$\Pr(A_1 \mid E_2 A_2)$ depend only on $\pi$ and $N$ whereas
$\Pr(A_1 \mid E_2 A_1)$ and $\Pr(A_1 \mid E_1 A_2)$ depend on $\pi$,
$N$ and $c$. In contrast, all four sequential probabilities depend on
$\pi$ and $\theta$ in the linear model. For detailed comparisons
between the linear model and the pattern model in application to
two-choice data, the reader is referred to Suppes and Atkinson (1960),
and Estes and Suppes (1962).
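
The four sequential predictions for the linear model are easily
tabulated. The sketch below uses the expressions for $a$ and $b$ given
above; as a consistency check of our own devising, it also averages the
four conditional probabilities over the joint distribution of event and
response, which in the noncontingent case at asymptote should return
$\pi$. Function names and parameter values are assumptions made for
illustration.

```python
def sequential_predictions(pi, theta):
    """Asymptotic Pr(A_1 on trial n+1 | event, response on trial n)."""
    a = (2 * pi * (1 - theta) + theta) / (2 - theta)
    b = (2 * (1 - pi) * (1 - theta) + theta) / (2 - theta)
    return {
        ("E1", "A1"): (1 - theta) * a + theta,
        ("E2", "A1"): (1 - theta) * a,
        ("E1", "A2"): 1 - (1 - theta) * b,
        ("E2", "A2"): (1 - theta) * (1 - b),
    }


def marginal_from_sequential(pi, theta):
    """Average the conditionals; events and responses are independent here."""
    event_prob = {"E1": pi, "E2": 1 - pi}
    response_prob = {"A1": pi, "A2": 1 - pi}   # asymptotic response probabilities
    table = sequential_predictions(pi, theta)
    return sum(p * event_prob[e] * response_prob[r] for (e, r), p in table.items())


if __name__ == "__main__":
    for key, value in sequential_predictions(0.7, 0.3).items():
        print(key, value)
    print(marginal_from_sequential(0.7, 0.3))
```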


5.4 Applications to Multi-person Interactions

In this section we apply the linear model to experimental situations
involving multi-person interactions in which the reinforcement for any
given subject depends both on his response and on the responses of other
subjects. Several recent investigations have provided evidence
indicating the fruitfulness of this line of development. For example,
Bush and Mosteller (1955) have analyzed a study of imitative behavior
in terms of their linear model, and Estes (1957a), Burke (1959, 1960)
and Atkinson and Suppes (1958) have derived and tested predictions from
linear models for behavior in two- and three-person games. Suppes and
Atkinson (1960) have also provided a comparison between pattern models
and linear models for multi-person experiments and have extended the
analysis to situations involving communication between subjects,
monetary payoff, social pressure, economic oligopolies, and related
variables.

The simple two-person game has particular advantages for expository
purposes, and we use this situation to illustrate the technique of
extending the linear model to multi-person interactions. We consider
a situation which, from the standpoint of game theory (see, e.g., Luce
and Raiffa, 1957), may be characterized as a game in normal form with
a finite number of strategies available to each player. Each play of
the game constitutes a trial, and a player's choice of a strategy for
a given trial corresponds to the selection of a response. To avoid
problems having to do with the measurement of utility (or from the
viewpoint of learning theory, problems of reward magnitude), we assume
a unit reward that is assigned on an all-or-none basis. Rules of the
game require the two players to exhibit their choices simultaneously
on all trials (as in a game of matching pennies), and each player is
informed that, given the choice of the other player on the trial, there
is exactly one choice leading to the unit reward.

We designate the two players as A and B and let $A_i$
($i = 1, \ldots, r$) and $B_j$ ($j = 1, \ldots, r'$) denote the
responses available to the two players. The set of reinforcement
probabilities prescribed by the experimenter may be represented in a
matrix $(a_{ij}, b_{ij})$ analogous to the "payoff matrix" familiar in
game theory. The number $a_{ij}$ represents the probability of Player A
being correct on any trial of the experiment given the response pair
$A_iB_j$; similarly, $b_{ij}$ is the probability of Player B being
correct given the response pair $A_iB_j$. For example, consider the
matrix

$$\begin{array}{c|cc}
     & B_1 & B_2 \\ \hline
A_1  & \tfrac{1}{2},\ \tfrac{1}{2} & 1,\ 0 \\
A_2  & 1,\ 0 & 0,\ 1
\end{array}$$

When both subjects make response 1, each has probability $\frac{1}{2}$
of receiving reward; when both make response 2, then only Player B
receives reward; when either of the other possible response pairs occurs
(i.e., $A_2B_1$ or $A_1B_2$), then only Player A receives reward. It
should be emphasized that although one usually thinks of one player
winning and the other losing on any given play of a game, this is not
a necessary restriction on the model. In theory, and in experimental
tests of the theory, it is quite possible to permit both or neither of
the players to be rewarded on any trial. However, to provide a
relatively simple theoretical interpretation of reinforcing events it is
essential that on a nonrewarded trial the player be informed (or led to
infer) that some other choice, had he made it under the same
circumstances, would have been successful. We return to this point
later.


Let $E_i^{(A)}$ denote the event of reinforcing the $A_i$ response for
Player A and $E_j^{(B)}$ the event of reinforcing the $B_j$ response for
Player B. To simplify our analysis we consider the case in which
each subject has only two response alternatives, and we define the
probability of occurrence of a particular reinforcing event in terms of
the payoff parameters as follows (for $i \ne i'$ and $j \ne j'$):

$$\begin{aligned}
a_{ij} &= \Pr(E_{i,n}^{(A)} \mid A_{i,n} B_{j,n}) , \\
b_{ij} &= \Pr(E_{j,n}^{(B)} \mid A_{i,n} B_{j,n}) , \\
1 - a_{ij} &= \Pr(E_{i',n}^{(A)} \mid A_{i,n} B_{j,n}) , \\
1 - b_{ij} &= \Pr(E_{j',n}^{(B)} \mid A_{i,n} B_{j,n}) .
\end{aligned} \tag{73}$$

For example, if Player A makes an $A_1$ response and is rewarded then
an $E_1^{(A)}$ occurs; however, if an $A_1$ is made and no reward occurs
then we assume that the other response is reinforced; i.e., an
$E_2^{(A)}$ occurs.

Finally, one last definition to simplify notation. We denote
Player A's response probability by $\alpha$ and Player B's by $\beta$,
and we denote by $\gamma$ the joint probability of an $A_1$ and $B_1$
response. Specifically,

$$\alpha_n = \Pr(A_{1,n}) , \qquad \beta_n = \Pr(B_{1,n}) , \qquad \gamma_n = \Pr(A_{1,n} B_{1,n}) . \tag{74}$$


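We now derive a theorem that provides recursive expressions for
$\alpha_n$ and $\beta_n$ and points up a property of the model that
greatly complicates the mathematics; namely, that both $\alpha_{n+1}$
and $\beta_{n+1}$ depend on the joint probability
$\gamma_n = \Pr(A_{1,n} B_{1,n})$.

Theorem.

$$\begin{aligned}
\alpha_{n+1} = {} & [1 - \theta_A(2 - a_{12} - a_{22})]\,\alpha_n + \theta_A(a_{22} - a_{21})\,\beta_n \\
 & + \theta_A(a_{11} + a_{21} - a_{12} - a_{22})\,\gamma_n + \theta_A(1 - a_{22}) ,
\end{aligned} \tag{75a}$$

$$\begin{aligned}
\beta_{n+1} = {} & [1 - \theta_B(2 - b_{21} - b_{22})]\,\beta_n + \theta_B(b_{22} - b_{12})\,\alpha_n \\
 & + \theta_B(b_{11} + b_{12} - b_{21} - b_{22})\,\gamma_n + \theta_B(1 - b_{22}) ,
\end{aligned} \tag{75b}$$

where $\theta_A$ and $\theta_B$ are the learning parameters for players
A and B.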


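Proof. It will suffice to derive the difference equation for
$\alpha_{n+1}$, since the derivation for $\beta_{n+1}$ is identical. To
begin with, from Axioms L1 and L2 we can easily show that the general
form of a recursion for $\alpha_n$ is

$$\alpha_{n+1} = (1 - \theta_A)\,\alpha_n + \theta_A \Pr(E_{1,n}^{(A)}) .$$

The term $\Pr(E_{1,n}^{(A)})$ can then be expanded as follows:

$$\Pr(E_{1,n}^{(A)}) = \sum_{i,j} \Pr(E_{1,n}^{(A)} A_{i,n} B_{j,n})
 = \sum_{i,j} \Pr(E_{1,n}^{(A)} \mid A_{i,n} B_{j,n}) \Pr(A_{i,n} B_{j,n}) ,$$

and by (73)

$$\begin{aligned}
\Pr(E_{1,n}^{(A)}) = {} & a_{11}\Pr(A_{1,n} B_{1,n}) + a_{12}\Pr(A_{1,n} B_{2,n}) \\
 & + (1 - a_{21})\Pr(A_{2,n} B_{1,n}) + (1 - a_{22})\Pr(A_{2,n} B_{2,n}) .
\end{aligned} \tag{76}$$

Next we observe that

$$\begin{aligned}
\Pr(A_{1,n} B_{2,n}) &= \Pr(B_{2,n} \mid A_{1,n}) \Pr(A_{1,n}) \\
 &= [1 - \Pr(B_{1,n} \mid A_{1,n})] \Pr(A_{1,n}) \\
 &= \Pr(A_{1,n}) - \Pr(A_{1,n} B_{1,n}) .
\end{aligned} \tag{77a}$$

Similarly

$$\Pr(A_{2,n} B_{1,n}) = \Pr(B_{1,n}) - \Pr(A_{1,n} B_{1,n}) , \tag{77b}$$

and

$$\begin{aligned}
\Pr(A_{2,n} B_{2,n}) &= \Pr(A_{2,n} \mid B_{2,n}) \Pr(B_{2,n}) \\
 &= [1 - \Pr(A_{1,n} \mid B_{2,n})] \Pr(B_{2,n}) \\
 &= \Pr(B_{2,n}) - \Pr(A_{1,n} B_{2,n}) \\
 &= 1 - \Pr(B_{1,n}) - \Pr(A_{1,n}) + \Pr(A_{1,n} B_{1,n}) .
\end{aligned} \tag{77c}$$

Substituting into Equation 76 from Equations 77a, 77b and 77c and
simplifying by means of the definitions for $\alpha$, $\beta$ and
$\gamma$, we obtain

$$\begin{aligned}
\Pr(E_{1,n}^{(A)}) = {} & a_{11}\gamma_n + a_{12}(\alpha_n - \gamma_n) + (1 - a_{21})(\beta_n - \gamma_n) \\
 & + (1 - a_{22})(1 - \alpha_n - \beta_n + \gamma_n) \\
 = {} & (a_{12} + a_{22} - 1)\,\alpha_n + (a_{22} - a_{21})\,\beta_n \\
 & + (a_{11} + a_{21} - a_{12} - a_{22})\,\gamma_n + (1 - a_{22}) .
\end{aligned}$$

Substituting this expression into the general recursion for $\alpha_n$
yields the desired result, which completes the proof.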

It has been shown by Lamperti and Suppes (1959) that the limits
$\alpha$, $\beta$ and $\gamma$ exist, whence (letting
$\alpha_{n+1} = \alpha_n = \alpha$, $\beta_{n+1} = \beta_n = \beta$
and $\gamma_n = \gamma$ in 75a and 75b) we have two linear relations
that are independent of $\theta_A$ and $\theta_B$; namely,

$$a\alpha = b\beta + c\gamma + d , \qquad e\beta = f\alpha + g\gamma + h , \tag{78}$$

where

$$\begin{aligned}
a &= 2 - a_{12} - a_{22} & \qquad e &= 2 - b_{21} - b_{22} \\
b &= a_{22} - a_{21} & \qquad f &= b_{22} - b_{12} \\
c &= a_{11} + a_{21} - a_{12} - a_{22} & \qquad g &= b_{11} + b_{12} - b_{21} - b_{22} \\
d &= 1 - a_{22} & \qquad h &= 1 - b_{22} .
\end{aligned} \tag{79}$$

By eliminating $\gamma$ from Equations 78 we obtain the following linear
relation in $\alpha$ and $\beta$:

$$(-ag - cf)\,\alpha + (bg + ce)\,\beta = ch - dg . \tag{80}$$


Unfortunately, this relationship is one of the few quantitative
results that can be directly computed for the linear model. It has,
however, the advantageous feature that it is independent of the
learning parameters $\theta_A$ and $\theta_B$ and therefore may be
compared directly with experimental data. Application of this result
can be illustrated in terms of the game cited earlier in which the
payoff matrix takes the form

$$\begin{array}{c|cc}
     & B_1 & B_2 \\ \hline
A_1  & \tfrac{1}{2},\ \tfrac{1}{2} & 1,\ 0 \\
A_2  & 1,\ 0 & 0,\ 1
\end{array}$$

From Equations 79 we obtain

$$\begin{aligned}
a &= 1 & \quad c &= \tfrac{1}{2} & \quad e &= 1 & \quad g &= -\tfrac{1}{2} \\
b &= -1 & \quad d &= 1 & \quad f &= 1 & \quad h &= 0
\end{aligned}$$

and Equation 80 becomes

$$\beta = \tfrac{1}{2} .$$

From this result we predict immediately that the long-run proportion
of $B_1$ responses will tend to $\frac{1}{2}$. To derive a prediction
for Player A, we substitute the known values of the parameters into the
first part of Eq. 78 to obtain

$$\alpha = -\beta + \tfrac{1}{2}\gamma + 1 = \tfrac{1}{2}\gamma + \tfrac{1}{2} .$$

Unfortunately we cannot compute $\gamma$, the asymptotic probability of
the $A_1B_1$ response pair. However, we know $\gamma$ is positive and
since only $\frac{1}{2}$ of Player B's responses are $B_1$'s, $\gamma$
cannot be greater than $\frac{1}{2}$. Therefore, we have
$0 \le \gamma \le \frac{1}{2}$ and as a result can set definite bounds
on the long-run probability of an $A_1$ response; namely

$$\tfrac{1}{2} \le \alpha \le \tfrac{3}{4} .$$

Thus, we have the basis for a rather exacting experimental test since
the asymptotic predictions for both subjects are parameter-free; i.e.,
they do not depend on the $\theta$-values of either subject or on
initial response probabilities.
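
The computation for this example can be laid out mechanically. The
sketch below assumes the coefficient definitions of Equations 79 in the
form $a = 2 - a_{12} - a_{22}$, $b = a_{22} - a_{21}$,
$c = a_{11} + a_{21} - a_{12} - a_{22}$, $d = 1 - a_{22}$,
$e = 2 - b_{21} - b_{22}$, $f = b_{22} - b_{12}$,
$g = b_{11} + b_{12} - b_{21} - b_{22}$, $h = 1 - b_{22}$, and the
relation $(-ag - cf)\alpha + (bg + ce)\beta = ch - dg$ for Equation 80.
The function names and the use of exact fractions are our own choices.

```python
from fractions import Fraction as Fr


def coefficients(a, b):
    """Coefficients of Equations 79 from 2 x 2 payoff probabilities.

    a[i][j] and b[i][j] hold a_ij and b_ij with zero-based indices,
    so a[0][1] is a_12, and so on.
    """
    return {
        "a": 2 - a[0][1] - a[1][1],
        "b": a[1][1] - a[1][0],
        "c": a[0][0] + a[1][0] - a[0][1] - a[1][1],
        "d": 1 - a[1][1],
        "e": 2 - b[1][0] - b[1][1],
        "f": b[1][1] - b[0][1],
        "g": b[0][0] + b[0][1] - b[1][0] - b[1][1],
        "h": 1 - b[1][1],
    }


def equation_80(k):
    """Return (coefficient of alpha, coefficient of beta, right-hand side)."""
    return (-k["a"] * k["g"] - k["c"] * k["f"],
            k["b"] * k["g"] + k["c"] * k["e"],
            k["c"] * k["h"] - k["d"] * k["g"])


def alpha_given(k, beta, gamma):
    """First relation of Equations 78 solved for alpha."""
    return (k["b"] * beta + k["c"] * gamma + k["d"]) / k["a"]


if __name__ == "__main__":
    half = Fr(1, 2)
    a = [[half, Fr(1)], [Fr(1), Fr(0)]]   # Player A's payoff probabilities
    b = [[half, Fr(0)], [Fr(0), Fr(1)]]   # Player B's payoff probabilities
    k = coefficients(a, b)
    ca, cb, rhs = equation_80(k)
    beta = rhs / cb                        # alpha drops out because ca == 0
    print(k, ca, cb, rhs, beta)
    print(alpha_given(k, beta, Fr(0)), alpha_given(k, beta, beta))
```

In this game the coefficient of $\alpha$ in Equation 80 happens to
vanish, which is why $\beta$ is determined outright while $\alpha$ is
only bounded.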

Of course, by imposing restrictions on the experimentally determined 
parameters a, j and Bes a variety of results can be obtained. We 7 
limit ourselves to the consideration of one such case: choice of the i 
parameters so that the coefficients of V5, vanish in the eadseiae 
equations 75a.and .75b. Specifically, if we let-c=g=.0 and 


af - be #0, then 
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a 
n+l 


tt 


a@ +bpB +4 
n n 
(82) 


Boa = eB, + fa +h ‘ 


Solutions for this system are well-known and can be obtained by a
number of different techniques; for a detailed discussion of the
problem of obtaining explicit expressions of $\alpha_n$ and $\beta_n$
for arbitrary $n$ the reader is referred to an article by Burke (1960).
We do know, however, that the limits for $\alpha_n$ and $\beta_n$ exist
and are independent of both the initial conditions and $\theta_A$ and
$\theta_B$. By substituting $\alpha = \alpha_{n+1} = \alpha_n$ and
$\beta = \beta_{n+1} = \beta_n$ into the two recursions we obtain

$$\alpha = \frac{bh + df}{af - be}$$

and

$$\beta = \frac{ah + de}{af - be} .$$


The fact that $\alpha$ and $\beta$ are independent of $\theta_A$ and
$\theta_B$ under the restrictions imposed on the parameters in no way
implies that $\gamma$ is also independent of these quantities.

Eq. 81 provides a very precise test of the model, and the necessary
conditions for this test involve only experimentally manipulable
parameters. A great deal of experimental work has been conducted on
this restricted problem and, in general, the correspondence between
predicted and observed values has been very good; for an account of
this work see Atkinson and Suppes (1958), Burke (1959, 1960), and
Suppes and Atkinson (1960).

In conclusion we should mention that all of the predictions presented
in this section are identical to those that can be derived from the
pattern model of Section 2. However, in general, only the grosser
predictions, such as those for $\alpha_n$ and $\beta_n$, are the same
for the two models.


6. DISCRIMINATION LEARNING


The distinction between simple learning and discrimination learning is
somewhat arbitrary. By discrimination we refer, roughly speaking, to
the process by which the subject learns to make one response to one of
a pair of stimuli and a different response to the other. But there is
an element of discrimination in any learning situation. Even in the
simplest conditioning experiment, the subject learns to make a
conditioned response only when the conditioned stimulus is presented,
and therefore to do something else when that stimulus is absent. In
the paired-associate situation (referred to several times in previous
sections) the subject learns to associate the appropriate member of a
response set with each member of a set of stimuli, and therefore to
"discriminate" the stimuli. The principal basis for differentiation
between the two categories of learning seems to be that in the case of
discrimination learning the similarity, or communality, between stimuli
is a major independent variable; in the case of simple learning,
stimulus similarity is an extraneous factor, to be minimized
experimentally and neglected in theory so far as possible.

One of the general strategic assumptions of the type of stimulus-
response theory which has been associated with the development of
stimulus sampling models is that discrimination learning involves
simply a combination of processes each of which can be studied
independently in simpler situations: the learning aspect in experiments
on simple acquisition or extinction, and the stimulus relationships in
experiments on stimulus generalization or transfer of training. Thus,
there will be nothing new at the conceptual level in our treatment of
discrimination. There is adequate scope for analysis of different types
of discriminative situations; but since our main concern in this
article is with methods rather than content, we shall not go far in
this direction. We propose only to show how the processes of
association and generalization treated in preceding sections enter into
discrimination learning, and this can be accomplished by formulating
assumptions and deriving results of general interest for a few
important cases.


6.1 The Pattern Model for Discrimination Learning

As in the cases of simple association and probability learning, it is
sometimes useful in the treatment of discriminative situations to
ignore generalization effects among the stimuli involved in an
experiment and regard each stimulus display as a unique pattern. Thus,
behavior elicited by the stimulus display will depend only on the
subject's reinforcement history with respect to that particular
pattern. Two important variants of the model arise according as
experimental conditions do or do not ensure that the subject will
sample the entire stimulus display presented on each trial.

Case 1. All cues presented are sampled on each trial. For a classical
two-stimulus, two-response discrimination problem (e.g., a Lashley
situation with the animal differentially rewarded for responses to a
black card and to a grey card), our conceptualization requires a
distinction among three types of cues. We shall denote by $S_1$ the set
of component cues present only in the stimulus situation associated
with reinforcement of response $A_1$; by $S_2$ the set of cues present
only in the situation associated with reinforcement of response $A_2$;
and by $S_c$ the set of cues present in both situations. In the example
of the Lashley situation, $A_1$ might be the response of jumping to the
left hand window, $A_2$ the response of jumping to the right hand
window, $S_1$ the stimulation present only on trials with black cards,
$S_2$ the stimulation present only on trials with grey cards, and $S_c$
the stimulation common to both types of trials. And we denote by $N_1$,
$N_2$, and $N_c$ the number of cues in each of these subsets. In
standard experiments, the "cues" refer to experimentally manipulable
aspects of the situation, such as tones, objects, colors, symbols, and
the like, and it is reasonably well-known just how many different
combinations of these cues will be responded to by subjects as distinct
patterns. In some instances, however, the experimenter may have no a
priori knowledge as to the patterns distinguishable by the subject; in
such instances, the $N_i$ may be treated as unknown parameters to be
estimated from data, and the model may thus serve as a tool to aid in
securing evidence as to the subject's perceptions of the physical
situation.

Suppose, now, that the experimenter's procedure is to present on some
trials ($T_1$ trials) a set of cues including $m_1$ from $S_1$ and
$m_c$ from $S_c$; and on the remaining trials ($T_2$ trials) $m_2$ cues
from $S_2$ and $m_c$ from $S_c$. Further, let the two types of trials
occur with equal frequencies in random sequence. On trials of type
$T_1$, there will be $\binom{N_1}{m_1}\binom{N_c}{m_c}$ different
patterns of cues available. Assuming that these patterns are all
equally probable, and letting

$$b_1 = \left[\binom{N_1}{m_1}\binom{N_c}{m_c}\right]^{-1},$$

we can obtain an expression for probability of a correct response on a
$T_1$ trial simply by appropriate substitution into Eq. 28, viz

$$\Pr(A_{1,n_1} \mid T_{1,n_1})
  = 1 - [1 - \Pr(A_{1,1} \mid T_{1,1})](1 - cb_1)^{n_1 - 1}, \tag{82}$$

where $n_1$ is the ordinal number of the $T_1$ trial. The corresponding
function for $T_2$ trials is obtained similarly with parameter

$$b_2 = \left[\binom{N_2}{m_2}\binom{N_c}{m_c}\right]^{-1}.$$
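The learning function for Case 1 can be evaluated directly. The sketch
below assumes the form of Eq. 82 in which the rate constant is $c$
times the reciprocal of the number of equiprobable patterns on one type
of trial; the starting probability of 0.5 and the function names are
illustrative assumptions, not part of the original text.

```python
from math import comb


def pattern_rate(n_relevant, m_relevant, n_common, m_common):
    """b = 1 / (number of equiprobable patterns on one type of trial)."""
    return 1.0 / (comb(n_relevant, m_relevant) * comb(n_common, m_common))


def p_correct(trial, c, b, p_first=0.5):
    """Probability correct on the trial-th presentation of one trial type:
    1 - (1 - p_first) * (1 - c*b) ** (trial - 1).
    p_first = 0.5 is an assumed starting value."""
    return 1.0 - (1.0 - p_first) * (1.0 - c * b) ** (trial - 1)


# One relevant and one common cue shown, drawn from two of each: 4 patterns.
b_few = pattern_rate(2, 1, 2, 1)
# Enlarging the common (irrelevant) set to 4 cues, 2 shown: 12 patterns.
b_many = pattern_rate(2, 1, 4, 2)

for trial in (1, 5, 20, 80):
    print(trial, round(p_correct(trial, 0.5, b_few), 3),
          round(p_correct(trial, 0.5, b_many), 3))
```

Both curves approach unity; the one with more available patterns does
so more slowly, since the rate constant shrinks as the pattern count
grows.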
In the discrimination literature, cues in the sets $S_1$ and $S_2$ are
commonly referred to as relevant, those in $S_c$ as irrelevant, since
the former are correlated with reinforcing events whereas the latter
are not. It is apparent by inspection of Eq. 82 that (for the above
specified experimental conditions) the pattern model predicts that
probability of correct responding will go asymptotically to unity
regardless of the numbers of relevant and irrelevant cues, provided
only that neither $m_1$ nor $m_2$ is equal to zero. Rate of approach to
asymptote on each type of trial is inversely related to the total
number of patterns available for sampling. Therefore, other things
equal, rate of learning is decreased (and total errors to criterion
increased) by the addition of either relevant or irrelevant cues.

Case 2. Partial sampling of the cues presented on each trial. We
consider now the situation which arises if the number of cues presented
per trial is too large, or the exposure time too short, for the entire
stimulus display to be sampled by the subject. Let us suppose for
simplicity that there are only two stimulus displays: the display on
$T_1$ trials comprises the $N_1$ cues of $S_1$ together with the $N_c$
cues of $S_c$, and that on $T_2$ trials the $N_2$ cues of $S_2$
together with the $N_c$ cues of $S_c$. For a given fixed exposure time,
we assume a fixed sample size $s$, with all samples of exactly $s$ cues
being equiprobable. On $T_1$ trials there will, then, be

$$\binom{N_1}{s_1}\binom{N_c}{s - s_1}$$

ways of filling the sample, with $s_1$ cues from $S_1$ and the
remainder from $S_c$. The asymptote of discriminative performance will
depend on the size of $s$ relative to $N_c$. If $s \le N_c$, so that
the entire sample can come from the set of irrelevant cues, then the
asymptotic probability of a correct response will be less than unity.

In Case 2, two types of patterns need to be distinguished for each type
of trial. We can limit consideration to $T_1$ trials, since analogous
arguments hold for $T_2$. There may be some patterns including only
cues from $S_c$, and learning with respect to these will be on a simple
random reinforcement schedule. The proportion of such patterns, $w_c$,
is given by

$$w_c = \binom{N_c}{s} \bigg/ \binom{N_1 + N_c}{s},$$

which is equal to zero if $s > N_c$. If $T_1$ and $T_2$ trials have
equal probabilities, then the probability, to be denoted $V_n$, that a
pattern containing only cues from $S_c$ will be conditioned to the
$A_1$ response on trial $n$ can be obtained from Eq. 28 by setting
$\pi_{12} = \pi_{21} = \tfrac12$, $\Pr(A_{1,n}) = V_n$, and
$c/N = cb_{1c}$, where $b_{1c} = \binom{N_1 + N_c}{s}^{-1}$; i.e.,

$$V_n = \tfrac12 - (\tfrac12 - V_1)(1 - cb_{1c})^{n-1} . \tag{83}$$

The remaining patterns available on $T_1$ trials all contain at least
one cue from $S_1$ and thus occur only on trials when response $A_1$ is
reinforced. The probability, to be denoted $U_n$, that any one of these
is conditioned to $A_1$ on trial $n$ may be similarly obtained by
rewriting Eq. 28, this time with $\pi_{12} = 0$, $\pi_{21} = 1$,
$\Pr(A_{1,n}) = U_n$, and $c/N = \tfrac12 cb_{1c}$; i.e.,

$$U_n = 1 - (1 - U_1)(1 - \tfrac12 cb_{1c})^{n-1} , \tag{84}$$

where the factor $\tfrac12$ enters because these patterns are available
for sampling on only $\tfrac12$ of the trials.

Now to obtain the probability of an $A_1$ response if a $T_1$ display
is presented on trial $n$, we need only combine Eq. 83 and 84,
weighting each by the probability of the appropriate type of pattern,
viz

$$\begin{aligned}
\Pr(A_{1,n} \mid T_{1,n}) &= (1 - w_c)U_n + w_cV_n \\
 &= 1 - \tfrac12 w_c - (1 - w_c)(1 - U_1)(1 - \tfrac12 cb_{1c})^{n-1}
    - w_c(\tfrac12 - V_1)(1 - cb_{1c})^{n-1} ,
\end{aligned} \tag{85}$$

which may be simplified, if $U_1 = V_1 = \tfrac12$, to

$$\Pr(A_{1,n} \mid T_{1,n})
  = 1 - \tfrac12 w_c - \tfrac12(1 - w_c)(1 - \tfrac12 cb_{1c})^{n-1} .
  \tag{85a}$$

The resulting expression for probability of a correct response has a
number of interesting general properties. The asymptote, as
anticipated, depends in a simple way on $w_c$, the proportion of
"irrelevant patterns". When $w_c = 0$, the asymptotic probability of a
correct response is unity; when $w_c = 1$, the whole process reduces to
simple random reinforcement. Between these extremes, asymptotic
performance varies inversely with $w_c$, so that the terminal
proportion of correct responses on either type of trial provides a
simple estimate of this parameter from data. The slope parameter,
$cb_{1c}$, could then be estimated from total errors over a series of
trials. As in Case 1, the rate of approach to asymptote proves to
depend only on the conditioning parameters and total number of patterns
available for sampling; thus it is a joint function of the total number
of cues, $N_1 + N_c$, and the sample size, $s$, but does not depend on
the relative proportions of relevant and irrelevant cues. The last
result may seem implausible, but it should be noted that the result
depends on the simplifying assumption of the pattern model that there
are no transfer effects from learning on one pattern to performance on
another pattern which has component cues in common with the first. The
situation in this regard will be different for the "mixed model" to be
discussed below.
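The partial-sampling case lends itself to a brief numerical sketch. It
assumes the simplified learning function in the form
$1 - \tfrac12 w_c - \tfrac12(1 - w_c)(1 - \tfrac12 cb)^{n-1}$, with
$w_c$ the proportion of samples drawn wholly from the common cues and
$b$ the reciprocal of the number of possible samples of size $s$; the
function names and example values are illustrative only.

```python
from math import comb


def irrelevant_proportion(n1, nc, s):
    """w_c: share of the size-s samples that contain only common cues.
    math.comb returns 0 when s > nc, so w_c is then 0 as well."""
    return comb(nc, s) / comb(n1 + nc, s)


def p_correct_partial(n, c, n1, nc, s):
    """Probability correct on trial n with both pattern types starting at 1/2."""
    w = irrelevant_proportion(n1, nc, s)
    b = 1.0 / comb(n1 + nc, s)
    return 1.0 - 0.5 * w - 0.5 * (1.0 - w) * (1.0 - 0.5 * c * b) ** (n - 1)


def asymptote(n1, nc, s):
    """Limiting probability of a correct response, 1 - w_c / 2."""
    return 1.0 - 0.5 * irrelevant_proportion(n1, nc, s)


# Two relevant and two common cues, samples of size two.
print(irrelevant_proportion(2, 2, 2), asymptote(2, 2, 2))
# With samples of size three the sample must include a relevant cue.
print(irrelevant_proportion(2, 2, 3), asymptote(2, 2, 3))
```

Since the asymptote is $1 - \tfrac12 w_c$, an observed terminal
proportion of correct responses $P$ would give the estimate
$w_c = 2(1 - P)$ under these assumptions.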


6.2 A Mixed Model

The pattern model may provide a relatively complete account of
discrimination data in situations involving only distinct, readily
discriminable patterns of stimulation, as, for example, the "paired
comparison" experiment discussed in Sec. 3.3 or the verbal
discrimination experiment treated by Bower (1962). Also, this model may
account for some aspects of the data (e.g., asymptotic performance
level, trials to criterion) even in discrimination experiments where
similarity, or communality, among stimuli is a major variable. But to
account for other aspects of the data in cases of the latter type, it
is necessary to deal with transfer effects throughout the course of
learning. The approach to this problem which we now wish to consider
employs no new conceptual apparatus, but simply a combination of ideas
developed in preceding sections.

In the mixed model, the conceptualization of the discriminative
situation and the learning assumptions are exactly the same as those of
the pattern model discussed in Sec. 6.1. The only change is in the
response rule, and that is altered in only one respect. As before, we
assume that, once a stimulus pattern has become conditioned to a
response, it will evoke that response on each subsequent occurrence
(unless on some later trial the pattern becomes reconditioned to a
different response, as may occur during reversal of a discrimination).
The new feature concerns patterns which have not yet become conditioned
to any of the response alternatives of the given experimental
situation, but which have component cues in common with other patterns
which have been so conditioned. Our assumption is simply that transfer
occurs from a conditioned to an unconditioned pattern in accordance
with the assumptions utilized in our earlier treatment of compounding
and generalization (specifically, by axiom C2, together with a modified
version of C1, of Sec. 4.1).

Before the assumptions about transfer can be employed unambiguously in
connection with the mixed model, the notion of conditioned status of a
component cue needs to be clarified. We shall say that a cue is
conditioned to response $A_i$ if it is a component of a stimulus
pattern that has become conditioned to response $A_i$. If a cue belongs
to two patterns, one of which is conditioned to response $A_i$ and one
to response $A_j$ ($i \neq j$), then the conditioning status of the cue
follows that of the more recently conditioned pattern. If a cue belongs
to no conditioned pattern, then it is said to be in the unconditioned,
or "guessing" state. Note that a pattern may be unconditioned even
though all of its cues are conditioned. Suppose, for example, that a
pattern consisting of cues x, y, and z in a particular arrangement has
never been presented during the first n trials of an experiment, but
that each of the cues has appeared in other patterns, say wxy and wvz,
which have been presented and conditioned. Then all of the cues of
pattern xyz would be conditioned, but the pattern would still be in the
unconditioned state. Consequently, if wxy had been conditioned to
response $A_1$ and wvz to $A_2$, the probability of $A_1$ in the
presence of pattern xyz would be $\tfrac23$. But if now response $A_1$
were effectively reinforced in the presence of xyz, its probability of
evocation by that pattern would henceforth be unity.

The only new complication arises if an unconditioned pattern includes
some cues which are still in the unconditioned state. Several
alternative ways of formulating the response rule for this case have
some plausibility, and it is by no means sure that any one choice will
prove to hold for all types of situations. We shall here limit
consideration to the formulation suggested by a recent study of
discrimination and transfer which has been analyzed in terms of the
mixed model (Estes and Hopkins, 1961). The amended response rule for
patterns including unconditioned cues is as follows in this
formulation: Axiom C2 of Sec. 4.1 is reinterpreted so that in a
situation involving $r$ response alternatives,

1. if all cues in a pattern are unconditioned, the probability of any
   response $A_i$ is equal to $1/r$;

2. if a pattern (sample) comprises $m$ cues conditioned to response
   $A_i$, $m'$ cues conditioned to other responses, and $m''$
   unconditioned cues, then the probability that $A_i$ will be evoked
   by this pattern is given by

$$\Pr(A_i) = \frac{m + \dfrac{m''}{r}}{m + m' + m''} .$$

In other words, axiom C2 holds, but with each unconditioned cue
contributing "weight" $1/r$ toward the evocation of each of the
alternative responses.
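The amended response rule is simple enough to state as a function. The
sketch below applies it to a pattern that is itself still
unconditioned; the function and argument names are illustrative, and
exact fractions are used so the results can be compared with the values
quoted in the text.

```python
from fractions import Fraction


def response_probability(m, m_other, m_uncond, r):
    """Pr(A_i) evoked by an unconditioned pattern made up of
    m cues conditioned to A_i, m_other cues conditioned to other
    responses, and m_uncond unconditioned cues, with r response
    alternatives. Each unconditioned cue carries weight 1/r."""
    total = m + m_other + m_uncond
    if total == 0:
        raise ValueError("a pattern must contain at least one cue")
    return (Fraction(m) + Fraction(m_uncond, r)) / total


# All cues unconditioned, two alternatives: pure guessing.
print(response_probability(0, 0, 2, 2))
# One cue conditioned to A_i, one unconditioned cue.
print(response_probability(1, 0, 1, 2))
# The xyz example: x and y conditioned to A_1, z conditioned to A_2.
print(response_probability(2, 1, 0, 2))
```

Clause 1 of the rule is the special case in which both conditioned
counts are zero, so a single function covers both clauses.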

To illustrate these assumptions in operation, let us consider a simple
classical discrimination experiment involving three cues, a, b, and c,
and two responses, $A_1$ and $A_2$. We shall assume that the pattern ac
is presented on half of the trials, with $A_1$ reinforced, and bc on
the other half of the trials, with $A_2$ reinforced, the two types of
trials occurring in random sequence. We assume further that conditions
are such as to ensure the subject's sampling both cues presented on
each trial. The possible conditioning states of each pattern and the
probability of response $A_1$ associated with each may now be tabulated
as follows:

        States              $A_1$ Probability
                             to each Pattern
      ac      bc              ac        bc

      1       2               1         0
      1       1               1         1
      2       2               0         0
      2       1               0         1
      0       1              3/4        1
      0       2              1/4        0
      1       0               1        3/4
      2       0               0        1/4
      0       0              1/2       1/2


where a 1, 2, or 0, respectively, in a State column indicates that the
pattern is conditioned to $A_1$, conditioned to $A_2$, or
unconditioned. For each pair of values under States, the associated
$A_1$ probabilities, computed according to the modified response rule,
are given in the corresponding positions under $A_1$ Probability. To
reduce algebraic complications, we shall carry out derivations for the
special case in which the subject starts the experiment with both
patterns unconditioned; then, under the conditions of reinforcement
specified above, only the states represented in the first, seventh,
sixth, and ninth rows of the table are available to the subject, and
for brevity we shall number these states 3, 2, 1, and 0, in the order
just listed. That is,

State 3 = pattern ac conditioned to $A_1$, and pattern bc conditioned
          to $A_2$;

State 2 = pattern ac conditioned to $A_1$, and pattern bc
          unconditioned;

State 1 = pattern ac unconditioned, and pattern bc conditioned to
          $A_2$;

State 0 = both patterns ac and bc are unconditioned.


Now these states can be interpreted as the states of a Markov chain,
since the probability of transition from any one of them to any other
on a given trial is independent of the preceding history. The matrix of
probabilities for one-step transitions among the four states takes the
following form:

$$Q = \begin{bmatrix}
1 & 0 & 0 & 0 \\
\tfrac{c}{2} & 1 - \tfrac{c}{2} & 0 & 0 \\
\tfrac{c}{2} & 0 & 1 - \tfrac{c}{2} & 0 \\
0 & \tfrac{c}{2} & \tfrac{c}{2} & 1 - c
\end{bmatrix} , \tag{86}$$




where the states are ordered 3, 2, 1, 0 from top to bottom and left to
right. Thus, State 3 (in which ac is conditioned to $A_1$ and bc to
$A_2$) is an absorbing state, and the process must terminate in this
state, with asymptotic probability of a correct response to each
pattern equal to unity. In State 2, ac is conditioned to $A_1$ but bc
is still unconditioned. This state can be reached only from State 0, in
which both patterns are unconditioned; the probability of the
transition is $\tfrac12$ (the probability that pattern ac is presented)
times $c$ (the probability that the reinforcing event produces
conditioning); thus the entry in the second cell of the bottom row is
$c/2$. From State 2, the subject can go only to State 3, and this
transition again has probability $c/2$. The other cells are filled in
similarly.


Now the probability, $u_{i,n}$, of being in State $i$ on trial $n$ can
be derived quite easily for each state. The subject is assumed to start
the experiment in State 0 and has probability $c$ of leaving this state
on each trial, hence

$$u_{0,n} = (1 - c)^{n-1} .$$

For State 1, we can write a recursion,

$$u_{1,n} = \tfrac{c}{2}\left(1 - \tfrac{c}{2}\right)^{n-2}
          + (1 - c)\tfrac{c}{2}\left(1 - \tfrac{c}{2}\right)^{n-3}
          + \cdots + (1 - c)^{n-2}\tfrac{c}{2} ,$$

which holds if $n \ge 2$. For, to be in State 1 on trial $n$, the
subject must have entered at the end of trial 1, which has probability
$c/2$, and then remained for $n - 2$ trials, which has probability
$(1 - \tfrac{c}{2})^{n-2}$; or have entered at the end of trial 2,
which has probability $(1 - c)\tfrac{c}{2}$, and then remained for
$n - 3$ trials, which has probability $(1 - \tfrac{c}{2})^{n-3}$; ...;
or have entered at the end of trial $n - 1$, which has probability
$(1 - c)^{n-2}\tfrac{c}{2}$. The right hand side of this recursion can
be summed to yield

$$u_{1,n} = \tfrac{c}{2}\left(1 - \tfrac{c}{2}\right)^{n-2}
            \sum_{i=0}^{n-2}\left(\frac{1 - c}{1 - \tfrac{c}{2}}\right)^{i}
          = \left(1 - \tfrac{c}{2}\right)^{n-1} - (1 - c)^{n-1} .$$

By an identical argument, we obtain

$$u_{2,n} = \left(1 - \tfrac{c}{2}\right)^{n-1} - (1 - c)^{n-1} ,$$

and then by subtraction

$$u_{3,n} = 1 - u_{2,n} - u_{1,n} - u_{0,n}
          = 1 - 2\left(1 - \tfrac{c}{2}\right)^{n-1} + (1 - c)^{n-1} .$$
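The state probabilities can be checked by iterating the chain directly.
The sketch below assumes the four-state transition matrix with states
ordered 3, 2, 1, 0 (State 3 absorbing, transitions of probability $c/2$
out of States 2 and 1 and into them from State 0) and compares the
result with the closed forms $(1 - c)^{n-1}$,
$(1 - \tfrac{c}{2})^{n-1} - (1 - c)^{n-1}$, and
$1 - 2(1 - \tfrac{c}{2})^{n-1} + (1 - c)^{n-1}$; the function names are
illustrative.

```python
def transition_matrix(c):
    """One-step transition probabilities, states ordered 3, 2, 1, 0."""
    h = c / 2.0
    return [
        [1.0, 0.0, 0.0, 0.0],      # State 3: absorbing
        [h, 1.0 - h, 0.0, 0.0],    # State 2: ac conditioned, bc not yet
        [h, 0.0, 1.0 - h, 0.0],    # State 1: bc conditioned, ac not yet
        [0.0, h, h, 1.0 - c],      # State 0: neither pattern conditioned
    ]


def state_probabilities(c, n):
    """Distribution over states (3, 2, 1, 0) on trial n, starting in State 0."""
    q = transition_matrix(c)
    dist = [0.0, 0.0, 0.0, 1.0]
    for _ in range(n - 1):
        dist = [sum(dist[i] * q[i][j] for i in range(4)) for j in range(4)]
    return dist


def closed_form(c, n):
    """Closed-form state probabilities in the same (3, 2, 1, 0) order."""
    x = (1.0 - c / 2.0) ** (n - 1)
    y = (1.0 - c) ** (n - 1)
    return [1.0 - 2.0 * x + y, x - y, x - y, y]


for n in (1, 2, 5, 20):
    print(n, [round(p, 4) for p in state_probabilities(0.3, n)])
```

Plain nested lists are used instead of a matrix library so that the
row-vector-times-matrix step is visible in the code.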


From the tabulation of states and response probabilities, we know that
the probability of response $A_1$ to pattern ac is equal to 1, 1,
$\tfrac14$, and $\tfrac12$, respectively, when the subject is in State
3, 2, 1, or 0. Consequently the probability of a correct ($A_1$)
response to ac is obtained simply by summing these response
probabilities, each weighted by the state probability, viz

$$\begin{aligned}
\Pr(A_{1,n} \mid ac)
 &= u_{3,n} + u_{2,n} + \tfrac14 u_{1,n} + \tfrac12 u_{0,n} \\
 &= 1 - 2\left(1 - \tfrac{c}{2}\right)^{n-1} + (1 - c)^{n-1}
      + \left(1 - \tfrac{c}{2}\right)^{n-1} - (1 - c)^{n-1} \\
 &\quad + \tfrac14\left(1 - \tfrac{c}{2}\right)^{n-1}
      - \tfrac14(1 - c)^{n-1} + \tfrac12(1 - c)^{n-1} \\
 &= 1 - \tfrac34\left(1 - \tfrac{c}{2}\right)^{n-1}
      + \tfrac14(1 - c)^{n-1} .
\end{aligned} \tag{87}$$


Equation 87 is written for the probability of an $A_1$ response to ac
on trial $n$; however, the expression for probability of an $A_2$
response to bc is identical, and consequently Eq. 87 expresses also the
probability, $p_n$, of a correct response on any trial, without regard
to the stimulus pattern presented. A simple estimator of the
conditioning parameter $c$ is now obtainable by summing the error
probability over trials. Letting $e$ denote the expected total errors
during learning, we have

$$e = \sum_{n=1}^{\infty}(1 - p_n) = \frac{5}{4c} .$$
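The expected total errors can be checked numerically. The sketch below
takes the learning function in the form
$p_n = 1 - \tfrac34(1 - \tfrac{c}{2})^{n-1} + \tfrac14(1 - c)^{n-1}$;
summing the two geometric series in $1 - p_n$ gives
$\tfrac{3}{2c} - \tfrac{1}{4c} = \tfrac{5}{4c}$. The function names and
the truncation point of the sum are illustrative assumptions.

```python
def p_correct(n, c):
    """Probability of a correct response on trial n (three-cue mixed model)."""
    return 1.0 - 0.75 * (1.0 - c / 2.0) ** (n - 1) + 0.25 * (1.0 - c) ** (n - 1)


def expected_total_errors(c, max_trials=100_000):
    """Numerical sum of the error probability 1 - p_n over trials."""
    return sum(1.0 - p_correct(n, c) for n in range(1, max_trials + 1))


def estimate_c(mean_total_errors):
    """Invert e = 5 / (4c) to estimate c from mean total errors per subject."""
    return 5.0 / (4.0 * mean_total_errors)


c = 0.2
print(expected_total_errors(c), 5.0 / (4.0 * c))
print(estimate_c(6.25))
```

In practice the mean number of errors per subject would be substituted
for $e$; the inversion is only as good as the model assumptions behind
the learning function.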


An example of the sort of prediction involving a relatively direct
assessment of transfer effects is the following. Suppose the first
stimulus pattern to appear is ac; the probability of a correct response
to it is, by hypothesis, $\tfrac12$, and if there were no transfer
between patterns, the probability of a correct response to bc when it
first appeared on a later trial should be $\tfrac12$ also. Under the
assumptions of the mixed model, however, the probability of a correct
response to bc, if it first appeared on trial 2, should be

$$\tfrac12(1 - c) + \tfrac14 c = \tfrac12 - \tfrac14 c ;$$

if it first appeared on trial 3, should be

$$\tfrac12(1 - c)^2 + \tfrac14[1 - (1 - c)^2]
  = \tfrac14 + \tfrac14(1 - c)^2 ;$$

and so on, tending to $\tfrac14$ after a sufficiently long prior
sequence of ac trials.
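The transfer prediction follows one pattern for any number of prior ac
trials. The sketch below assumes that each ac trial conditions pattern
ac with probability $c$, that bc then evokes the correct response with
probability $\tfrac14$ if ac has been conditioned and $\tfrac12$
otherwise, and that bc has not itself appeared; the function name is
illustrative.

```python
def first_bc_correct(prior_ac_trials, c):
    """Probability of a correct response to bc on its first appearance,
    after the given number of ac trials."""
    ac_unconditioned = (1.0 - c) ** prior_ac_trials
    return 0.5 * ac_unconditioned + 0.25 * (1.0 - ac_unconditioned)


c = 0.4
for k in (0, 1, 2, 5, 50):
    print(k, round(first_bc_correct(k, c), 4))
```

The values fall from $\tfrac12$ toward $\tfrac14$ as the run of ac
trials lengthens, which is the negative transfer the mixed model
predicts and the pure pattern model does not.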





Simply by inspection of the transition matrix, we can develop an
interesting prediction concerning behavior during the presolution
period of the experiment. By presolution period, we mean the sequence
of trials prior to the last error for any given subject. We know that
the subject cannot be in State 3 on any trial prior to the last error.
On all trials of the presolution period, probability of a correct
response should be equal either to $\tfrac12$ (if no conditioning has
occurred) or to $\tfrac58$ (if exactly one of the two stimulus patterns
has been conditioned to its correct response). Thus the proportion,
which we may denote by $P_{ps}$, of correct responses over the
presolution trial sequence should fall in the interval

$$\tfrac12 \le P_{ps} \le \tfrac58 ,$$

and, in fact, the same bounds obtain for any subset of trials within
the presolution sequence. Clearly predictions from this model
concerning presolution responding differ sharply from those derivable
from any model that assumes a continuous increase in probability of
correct responding during the presolution period; this model also
differs, though not so sharply, from a pure "insight" model assuming no
learning on presolution trials. So far as we know, no data relevant to
these differential predictions are available in the literature (though
similar predictions have been tested in somewhat different situations:
Suppes and Ginsberg, 1962a; Theios, 1961). Now that the predictions are
in hand, it seems likely that pertinent analyses will be forthcoming.

The development in this section was for the case where there were only
three cues a, b, and c. For the more general case we could assume that
there are $N_a$ cues associated with stimulus a, $N_b$ with stimulus b,
and $N_c$ with stimulus c. If we assume, as we have in this section,
that experimental conditions are such as to ensure the subject's
sampling all cues presented on each trial, then Eq. 87 may be rewritten
as

$$\Pr(A_{1,n} \mid ac) = 1 - \tfrac12(1 + w_1)\left(1 - \tfrac{c}{2}\right)^{n-1}
                         + \tfrac12 w_1(1 - c)^{n-1} ,$$

$$\Pr(A_{2,n} \mid bc) = 1 - \tfrac12(1 + w_2)\left(1 - \tfrac{c}{2}\right)^{n-1}
                         + \tfrac12 w_2(1 - c)^{n-1} ,$$

where $w_1 = \dfrac{N_c}{N_a + N_c}$ and $w_2 = \dfrac{N_c}{N_b + N_c}$.
Further,

$$e = \sum_{n=1}^{\infty}\left\{\tfrac12[1 - \Pr(A_{1,n} \mid ac)]
      + \tfrac12[1 - \Pr(A_{2,n} \mid bc)]\right\} = \frac{2 + W}{2c} ,$$

where $W = \tfrac12(w_1 + w_2)$. The parameter $W$ is an index of
similarity between the stimuli ac and bc; as $W$ approaches its maximum
value of 1, the number of total errors increases. Further, the
proportion of correct responses over the presolution trial sequence
should fall in either the interval

$$\tfrac12 \le P_{ps} \le \tfrac12 + \tfrac14(1 - w_2)$$

or the interval

$$\tfrac12 \le P_{ps} \le \tfrac12 + \tfrac14(1 - w_1) ,$$

depending on whether ac or bc is conditioned first.
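The role of the similarity index can be illustrated numerically. The
sketch below assumes the generalized learning functions in the form
$1 - \tfrac12(1 + w)(1 - \tfrac{c}{2})^{n-1} + \tfrac12 w(1 - c)^{n-1}$
with $w_1 = N_c/(N_a + N_c)$ and $w_2 = N_c/(N_b + N_c)$; under that
assumption the summed error probability works out to $(2 + W)/(2c)$.
The function names and cue counts are illustrative.

```python
def similarity_weights(n_a, n_b, n_c):
    """Return (w1, w2): the share of common cues in patterns ac and bc."""
    return n_c / (n_a + n_c), n_c / (n_b + n_c)


def p_correct(n, c, w):
    """Probability correct on trial n for a pattern with common-cue share w."""
    return (1.0 - 0.5 * (1.0 + w) * (1.0 - c / 2.0) ** (n - 1)
            + 0.5 * w * (1.0 - c) ** (n - 1))


def expected_errors(c, w1, w2):
    """Closed form (2 + W) / (2c), with W the mean of w1 and w2."""
    big_w = 0.5 * (w1 + w2)
    return (2.0 + big_w) / (2.0 * c)


def summed_errors(c, w1, w2, max_trials=50_000):
    """Numerical sum of the error probability, patterns equally likely."""
    return sum(0.5 * (1.0 - p_correct(n, c, w1)) + 0.5 * (1.0 - p_correct(n, c, w2))
               for n in range(1, max_trials + 1))


def presolution_upper_bounds(w1, w2):
    """Upper bounds on presolution proportion correct when ac, respectively
    bc, is the pattern conditioned first."""
    return 0.5 + 0.25 * (1.0 - w2), 0.5 + 0.25 * (1.0 - w1)


w1, w2 = similarity_weights(1, 1, 1)     # the three-cue case: w1 = w2 = 1/2
print(expected_errors(0.25, w1, w2), presolution_upper_bounds(w1, w2))
w1, w2 = similarity_weights(1, 1, 9)     # many common cues: high similarity
print(expected_errors(0.25, w1, w2), presolution_upper_bounds(w1, w2))
```

With one cue in each set the results reduce to the three-cue values,
$5/(4c)$ errors and an upper bound of $\tfrac58$.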


6.3 Component Models

So long as the number of stimulus patterns involved in a discrimination
experiment is relatively small, an analysis in terms of an appropriate
case of the mixed model can be effected along the lines indicated in
Sec. 6.2. But the number of cues need become only moderately large in
order to generate a number of patterns so great as to be unmanageable
by these methods. However, if the number of patterns is large enough so
that any particular pattern is unlikely to be sampled more than once
during an experiment, the emendations of the response rule presented in
Sec. 6.2 can be neglected and the process treated as a simple extension
of the component model of Sec. 5.1.

Suppose, for example, that a classical discrimination involved a set
$S_1$ of cues available only on trials when $A_1$ is reinforced, a set
$S_2$ of cues available only on trials when $A_2$ is reinforced, and a
set $S_c$ of cues common to $S_1$ and $S_2$; further, assume that a
constant fraction of each set presented is sampled by the subject on
any trial. If the two types of trials occur with equal probabilities,
and if the numbers of cues in the various sets are large enough so that
the number of possible trial samples is larger than the number of
trials in the experiment, then we may apply Eq. 53 of Sec. 4.3 to
obtain approximate expressions for response probabilities. For example,
asymptotically all of the $N_1$ elements of $S_1$ and half of the $N_c$
elements of $S_c$ (on the average) would be conditioned to response
$A_1$, and therefore probability of $A_1$ on a trial when $S_1$ was
presented would be predicted by the component model to be

$$\Pr(A_1 \mid S_1) = \frac{N_1 + \tfrac12 N_c}{N_1 + N_c} ,$$

which will, in general, have a value intermediate between $\tfrac12$
and unity.
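The component-model asymptote is easy to tabulate. The sketch below
evaluates $(N_1 + \tfrac12 N_c)/(N_1 + N_c)$ for a few illustrative cue
counts; the function name and the counts are assumptions chosen only to
show the range of the prediction.

```python
from fractions import Fraction


def asymptotic_p_a1(n1, nc):
    """Asymptotic Pr(A_1 | S_1): all n1 relevant cues and, on the average,
    half of the nc common cues are conditioned to A_1."""
    return (Fraction(n1) + Fraction(nc, 2)) / (n1 + nc)


for n1, nc in ((10, 0), (10, 10), (10, 30), (1, 1000)):
    print(n1, nc, asymptotic_p_a1(n1, nc))
```

With no common cues the prediction is unity, and it falls toward
$\tfrac12$ as the common cues come to dominate the display.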


Functions for learning curves and other aspects of the data can be
derived for various types of discrimination experiments from the
assumptions of the component model. Numerous results of this sort have
been published (Burke and Estes, 1957; Bush and Mosteller, 1951b;
Estes, 1958, 1961a; Estes, Burke, Atkinson, and Frankmann, 1957;
Popper, 1959; Popper and Atkinson, 1958).


6.4 Analysis of a Signal Detection Experiment 

Although thus far we have developed stimulus sampling models only in
connection with simple associative learning and discrimination
learning, it should be noted that such models may have much broader
areas of application. On occasion one may even see possibilities of
using the concepts of stimulus sampling and association to interpret
experiments that, by conventional classifications, do not fall within
the area of learning. In this section we examine such a case.

The experiment to be considered fits one of the standard paradigms
associated with studies of signal detection (see, e.g., Tanner and
Swets, 1954; Swets, Tanner and Birdsall, 1961). The subject's task in
this experiment, like that of an observer monitoring a radar screen, is
to detect the presence of a visual signal which may occur from time to
time in one of several possible locations. Problems of interest in
connection with theories of signal detection arise when the signals are
faint enough so that the observer is unable to report them with
complete accuracy on all occasions. One empirical relation that we
would want to account for, in quantitative detail, is that between
detection probabilities and the relative frequencies with which signals
occur in different locations. Another is the improvement in detection
rate that may occur over a series of trials even when the observer
receives no knowledge of results.

A possible way of accounting for the "practice effect" is suggested by
some rather obvious analogies between the detection experiment and the
probability learning experiment considered earlier. We would expect
that, when the subject actually detects a signal (in terms of stimulus
sampling theory, samples the corresponding stimulus element), he will
make the appropriate verbal report. Further, in the absence of any
other information, this detection of the signal may act as a
reinforcing event, leading to conditioning of the verbal report to
other cues in the situation which may have been available for sampling
prior to the occurrence of the signal. If so, and if signals occur in
some locations more often than in others, then on the basis of the
theory developed in earlier sections we should predict that the subject
will come to report the signal in the preferred location more
frequently than in others on trials when he fails to detect a signal
and is forced to respond to background cues. These notions will be made
more explicit in connection with the following analysis of a visual
recognition experiment reported by Kinchla (1962).

Kinchla employed a forced-choice visual detection situation involving a
series of over 900 discrete trials for each subject. Two areas were
outlined on a uniformly illuminated milk glass screen. Each trial began
with an auditory signal. During the auditory signal one of the
following events occurred:

(1) A fixed increment in radiant intensity occurred in area 1 -
    a $T_1$ type trial.

(2) A fixed increment in radiant intensity occurred in area 2 -
    a $T_2$ type trial.

(3) No change in the radiant character of either signal area
    occurred - a $T_0$ type trial.


Subjects were told that a change in illumination would occur in one of
the two areas on each trial. Following the auditory signal, the subject
was required to make either an $A_1$ or $A_2$ response (i.e., select
one of two keys placed below the signal areas) to indicate which area
he believed had changed in brightness. The subject was given no
information at the end of the trial as to whether or not his response
was correct. Thus, on a given trial one of three events occurred
($T_1$, $T_2$, $T_0$), the subject made either an $A_1$ or $A_2$
response, and a short time later the next trial began.

For a fixed signal intensity the experimenter has the option of
specifying a schedule for presenting the $T_i$ events. Kinchla selected
a simple probabilistic procedure in which $\Pr(T_{i,n}) = \xi_i$ and
$\xi_1 + \xi_2 + \xi_0 = 1$. Two groups of subjects were run. For Group I,
$\xi_1 = \xi_2 = .4$ and $\xi_0 = .2$. For Group II, $\xi_1 = \xi_0 = .2$
and $\xi_2 = .6$. The purpose of Kinchla's study was to determine how
these event schedules influenced the likelihood of correct detections.

The model that we will use to analyze the experiment combines
two quite distinct processes: a simple perceptual process defined
with regard to the signal events and a learning process associated
with background cues. The stimulus situation is conceptually
represented in terms of two sensory elements $s_1$ and $s_2$, corresponding
to the two alternative signals, and a set $S$ of elements associated
with stimulus features common to all trials. On every trial the
subject is assumed to sample a single element from the background set $S$
and he may or may not sample one of the sensory elements. If the $s_1$
element is sampled, an $A_1$ occurs; if $s_2$ is sampled, an $A_2$
occurs. If neither sensory element is sampled, the subject makes the
response to which the background element is conditioned. Conditioning
of elements in $S$ changes from trial to trial via a learning process.

The sampling of sensory elements depends on the trial type ($T_1$,
$T_2$, $T_0$) and is described by a simple probabilistic model. The
learning process associated with $S$ is assumed to be the multi-element
pattern model presented in Sec. 3. Specifically, the assumptions of
the model are embodied in the following statements:


1. If $T_i$ ($i = 1, 2$) occurs, then sensory element $s_i$ will be
   sampled with probability $h$ (with probability $1-h$, neither
   $s_1$ nor $s_2$ will be sampled). If $T_0$ occurs, then neither
   $s_1$ nor $s_2$ will be sampled.

2. Exactly one element is sampled from $S$ on every trial.
   Given the set $S$ of $N$ elements, the probability of
   sampling a particular element is $1/N$.

3. If $s_i$ ($i = 1, 2$) is sampled on trial $n$, then with
   probability $c'$ the element sampled from $S$ on the
   trial becomes conditioned to $A_i$ at the end of trial $n$.
   If neither $s_1$ nor $s_2$ is sampled, then with probability
   $c$ the element sampled from $S$ becomes conditioned
   with equal likelihood to $A_1$ or $A_2$ at the end of trial $n$.

4. If sensory element $s_i$ is sampled, then $A_i$ will occur.
   If neither sensory element is sampled, then the response
   to which the sampled element from $S$ is conditioned will
   occur.
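The four statements above can be read as a trial-by-trial simulation recipe. The following is a minimal sketch, not part of the original analysis: the function name `simulate_detection`, the integer set size of 4 (the model itself only needs a whole number of background elements), the burn-in length, and the random seed are all illustrative assumptions. It returns the observed proportion of $A_1$ responses on each trial type.

```python
import random


def simulate_detection(xi1, xi2, xi0, h, c, c_prime, n_elements,
                       n_trials, burn_in, seed=0):
    """Simulate statements 1-4; return Pr(A1 | T_j) estimates keyed by j."""
    assert abs(xi1 + xi2 + xi0 - 1.0) < 1e-9
    rng = random.Random(seed)
    # Conditioning state of each background element: 1 means A1, 2 means A2.
    background = [rng.choice((1, 2)) for _ in range(n_elements)]
    counts = {0: [0, 0], 1: [0, 0], 2: [0, 0]}  # type -> [A1 count, trials]
    for trial in range(n_trials):
        u = rng.random()
        trial_type = 1 if u < xi1 else (2 if u < xi1 + xi2 else 0)
        # Statement 1: on a signal trial the sensory element is sampled w.p. h.
        detected = trial_type != 0 and rng.random() < h
        # Statement 2: exactly one background element, each with prob. 1/N.
        k = rng.randrange(n_elements)
        # Statement 4: the sensory element, if sampled, controls the response.
        response = trial_type if detected else background[k]
        if trial >= burn_in:
            counts[trial_type][0] += response == 1
            counts[trial_type][1] += 1
        # Statement 3: conditioning of the sampled background element.
        if detected:
            if rng.random() < c_prime:
                background[k] = trial_type
        elif rng.random() < c:
            background[k] = rng.choice((1, 2))
    return {t: a1 / total for t, (a1, total) in counts.items()}


# Group I schedule, with parameter values of the size reported in the text.
group_one = simulate_detection(0.4, 0.4, 0.2, h=0.289, c=0.357, c_prime=1.0,
                               n_elements=4, n_trials=100_000, burn_in=1_000)
```

With a symmetric schedule the simulated proportion of $A_1$ responses on $T_0$ trials should settle near one half, which is the behavior the asymptotic analysis in the text leads one to expect.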


If we let $p_n$ denote the expected proportion of elements in $S$
conditioned to $A_1$ at the start of trial $n$, then (in terms of
statements 1 and 4 above) we can immediately write an expression for the
likelihood of an $A_i$ response given a $T_j$ event; namely,

$$\Pr(A_{1,n} \mid T_{1,n}) = h + (1-h)\,p_n \tag{88a}$$

$$\Pr(A_{2,n} \mid T_{2,n}) = h + (1-h)(1-p_n) \tag{88b}$$

$$\Pr(A_{1,n} \mid T_{0,n}) = p_n \tag{88c}$$

The expression for $p_n$ can be obtained from statements 2 and 3 by the
same methods used throughout Sec. 3 of this chapter and is as follows
(for a derivation of this result see Atkinson, 1962a):

$$p_n = p_\infty - (p_\infty - p_1)\left[1 - \frac{1}{N}(a+b)\right]^{n-1},$$

where $a = \xi_1 h c' + (1-h)\frac{c}{2} + \xi_0 h \frac{c}{2}$,
$b = \xi_2 h c' + (1-h)\frac{c}{2} + \xi_0 h \frac{c}{2}$,
and $p_\infty = \frac{a}{a+b}$. Dividing the numerator and denominator of
$p_\infty$ by $c$ yields the expression

$$p_\infty = \frac{\xi_1 h \psi + \frac{1}{2}(1-h) + \frac{1}{2}\xi_0 h}
                  {(1-\xi_0)\,h\psi + (1-h) + \xi_0 h} \tag{89}$$

where $\psi = c'/c$. Thus, the asymptotic expression for $p_n$ does not
depend on the absolute values of $c'$ and $c$ but only on their ratio.
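The learning curve for $p_n$ follows from a one-step recursion: the sampled element ends the trial newly conditioned to $A_1$ with probability $a$ and to $A_2$ with probability $b$, so $p_{n+1} = p_n + \frac{1}{N}\,[a(1-p_n) - b\,p_n]$. The sketch below checks that this recursion reproduces the closed form and that Eq. 89 depends only on the ratio $\psi$. The function names and the starting value $p_1 = .5$ are illustrative assumptions, not part of the source.

```python
def rates(xi1, xi2, xi0, h, c, c_prime):
    """Probabilities a and b that the sampled background element ends a
    trial conditioned to A1 or A2 through a conditioning event."""
    neither = (1 - h) + xi0 * h      # no sensory element is sampled
    a = xi1 * h * c_prime + neither * c / 2
    b = xi2 * h * c_prime + neither * c / 2
    return a, b


def p_closed_form(n, p1, n_elements, a, b):
    """Closed-form expression for p_n given in the text."""
    p_inf = a / (a + b)
    return p_inf - (p_inf - p1) * (1 - (a + b) / n_elements) ** (n - 1)


def p_by_recursion(n, p1, n_elements, a, b):
    """Iterate p_{n+1} = p_n + [a(1 - p_n) - b p_n] / N from trial 1."""
    p = p1
    for _ in range(n - 1):
        p = p + (a * (1 - p) - b * p) / n_elements
    return p


def p_inf_from_ratio(xi1, xi0, h, psi):
    """Eq. 89: the asymptote written in terms of psi = c'/c only."""
    neither = (1 - h) + xi0 * h
    return (xi1 * h * psi + neither / 2) / ((1 - xi0) * h * psi + neither)


# Group II schedule; c' = 1.0 and c = 0.5 are arbitrary illustrative values.
a, b = rates(0.2, 0.6, 0.2, h=0.289, c=0.5, c_prime=1.0)
curve = [p_closed_form(n, 0.5, 4.23, a, b) for n in range(1, 31)]
```

Because only the ratio enters Eq. 89, doubling both $c$ and $c'$ changes the speed of approach but leaves the asymptote where it was.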

An inspection of Kinchla's data indicates that the curves for
$\Pr(A_i \mid T_j)$ are extremely stable over the last 400 or so trials of
the experiment; consequently we shall view this portion of the data as
asymptotic. Table 7 presents the observed mean values of $\Pr(A_i \mid T_j)$
for the last 400 trials. The corresponding asymptotic expressions are
specified in terms of Eq. 88 and Eq. 89 and are simply

$$\lim_{n \to \infty} \Pr(A_{1,n} \mid T_{1,n}) = h + (1-h)\,p_\infty \tag{90a}$$

$$\lim_{n \to \infty} \Pr(A_{2,n} \mid T_{2,n}) = h + (1-h)(1-p_\infty) \tag{90b}$$

$$\lim_{n \to \infty} \Pr(A_{1,n} \mid T_{0,n}) = p_\infty \tag{90c}$$

Insert Table 7 about here

Table 7

Predicted and Observed Asymptotic Response Probabilities
for Visual Detection Experiment

[Table body not recoverable from the source; surviving column labels: Group II, Predicted.]


In order to generate asymptotic predictions we need values for $h$ and
$\psi$. We first note by inspection of Eq. 89 that $p_\infty = \frac{1}{2}$
for Group I; in fact, whenever $\xi_1 = \xi_2$ we have $p_\infty = \frac{1}{2}$.
Hence, taking the observed asymptotic value for $\Pr(A_1 \mid T_1)$ in
Group I (i.e., .645) and setting it equal to $h + (1-h)\frac{1}{2}$ yields
an estimate of $h = .289$. The background illumination and the increment
in radiant intensity are the same for both experimental groups and
therefore we would require an estimate of $h$ obtained from Group I to
be applicable to Group II. In order to estimate $\psi$, we take the
observed asymptotic value of $\Pr(A_1 \mid T_0)$ in Group II and set it
equal to the right side of Eq. 89 with $h = .289$, $\xi_1 = \xi_0 = .2$
and $\xi_2 = .6$; solving for $\psi$ we obtain $\hat{\psi} = 2.8$.
Using these estimates of $h$ and $\psi$, Eqs. 89 and 90 yield the
asymptotic predictions given in Table 7.
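The two estimation steps can be sketched numerically. The helper names below are hypothetical, and the closed-form inversion of Eq. 89 for $\psi$ is our own rearrangement rather than a formula quoted from the source. Note that the rounded observed value .645 gives $h \approx .29$; the reported .289 presumably rests on an unrounded observation, which is an assumption on our part.

```python
def p_inf(xi1, xi0, h, psi):
    """Eq. 89: asymptotic proportion of background elements tied to A1."""
    neither = (1 - h) + xi0 * h
    return (xi1 * h * psi + neither / 2) / ((1 - xi0) * h * psi + neither)


def estimate_h(observed_a1_given_t1):
    """Solve h + (1 - h)/2 = observed value (valid when xi1 == xi2)."""
    return 2 * observed_a1_given_t1 - 1


def solve_psi(observed_a1_given_t0, xi1, xi0, h):
    """Invert Eq. 89 for psi, given an observed Pr(A1 | T0) = p_inf."""
    neither = (1 - h) + xi0 * h
    p = observed_a1_given_t0
    return neither * (0.5 - p) / (h * (p * (1 - xi0) - xi1))


def asymptotic_predictions(xi1, xi0, h, psi):
    """Eq. 90 evaluated at the asymptote of Eq. 89."""
    p = p_inf(xi1, xi0, h, psi)
    return {"p_inf": p,
            "A1|T1": h + (1 - h) * p,
            "A2|T2": h + (1 - h) * (1 - p),
            "A1|T0": p}


h_hat = estimate_h(0.645)                      # Group I, from the text
group_two = asymptotic_predictions(0.2, 0.2, h=0.289, psi=2.8)
```

Feeding the Group II value of $p_\infty$ back through `solve_psi` recovers $\psi = 2.8$, which confirms that the inversion and Eq. 89 are mutually consistent.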

Overall the equations give an excellent account of these particular
response measures. However, a more crucial test of the model is provided
by an analysis of the sequential data. To indicate the nature of the
sequential predictions that can be obtained, consider the probability
of an $A_1$ response on a $T_1$ trial given the various trial types and
responses that can occur on the preceding trial, i.e.,

$$\Pr(A_{1,n+1} \mid T_{1,n+1} A_{i,n} T_{j,n})$$

where $i = 1, 2$ and $j = 0, 1, 2$. Explicit expressions for these
quantities can be derived from the axioms by the same methods used
throughout this chapter. To indicate their form, theoretical
expressions for $\lim_{n \to \infty} \Pr(A_{1,n+1} \mid T_{1,n+1} A_{i,n} T_{j,n})$
will be given and, to simplify notation, they will be written as
$\Pr(A_1 \mid T_1 A_i T_j)$. The expressions for these quantities are as
follows:

$$\Pr(A_1 \mid T_1 A_1 T_1) = \frac{[h + (1-h)\delta]\,p_\infty + (1-p_\infty)\,h\gamma'}{N X} + \frac{(N-1)X}{N} \tag{91a}$$

$$\Pr(A_1 \mid T_1 A_2 T_1) = \frac{\delta'}{N} + \frac{(N-1)X}{N} \tag{91b}$$

$$\Pr(A_1 \mid T_1 A_1 T_2) = \frac{\delta}{N} + \frac{(N-1)X}{N} \tag{91c}$$

$$\Pr(A_1 \mid T_1 A_2 T_2) = \frac{(1-h)\delta'(1-p_\infty) + p_\infty h\gamma + h^2(1-p_\infty)}{N Y} + \frac{(N-1)X}{N} \tag{91d}$$

$$\Pr(A_1 \mid T_1 A_1 T_0) = \frac{\delta}{N} + \frac{(N-1)X}{N} \tag{91e}$$

$$\Pr(A_1 \mid T_1 A_2 T_0) = \frac{\delta'}{N} + \frac{(N-1)X}{N} \tag{91f}$$

where $\gamma = c'h + (1-c')$, $\gamma' = c' + (1-c')h$,
$\delta = \frac{c}{2}h + (1-\frac{c}{2})$,
$\delta' = \frac{c}{2} + (1-\frac{c}{2})h$,
$X = h + (1-h)p_\infty$, and $Y = h + (1-h)(1-p_\infty)$.



It is interesting to note that the asymptotic expressions for
$\lim \Pr(A_{i,n} \mid T_{j,n})$ depend only on $h$ and $\psi$, whereas the
quantities in Eq. 91 are functions of all four parameters $N$, $c$, $c'$
and $h$. Comparable sets of equations can be written for
$\Pr(A_2 \mid T_2 A_i T_j)$ and $\Pr(A_1 \mid T_0 A_i T_j)$.

The expressions in Eq. 91 are rather formidable, but numerical
predictions can be easily calculated once values for the parameters have
been obtained. Further, independently of the parameter values, certain
relations among the sequential probabilities can be specified. As an
example of such a relation, it can be shown that
$\Pr(A_1 \mid T_1 A_1 T_0) \ge \Pr(A_1 \mid T_1 A_2 T_0)$ for any stimulus
schedule and any set of parameter values. To see this, simply subtract
Eq. 91f from Eq. 91e and note that $\delta \ge \delta'$, since
$\delta - \delta' = (1-h)(1-c)$.
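The sequential expressions are tedious by hand but short in code. The sketch below evaluates the six quantities of Eq. 91 in the form that follows from statements 1 to 4 (the same sampled element recurs with probability $1/N$, otherwise the next response is governed by $X$); the dictionary keys and function names are illustrative assumptions. A useful internal check is that averaging the six conditional probabilities over the preceding-trial events must give back $X$.

```python
def p_inf(xi1, xi0, h, psi):
    """Eq. 89."""
    neither = (1 - h) + xi0 * h
    return (xi1 * h * psi + neither / 2) / ((1 - xi0) * h * psi + neither)


def sequential_predictions(n_elements, c, c_prime, h, p):
    """Asymptotic Pr(A1 | T1 A_i T_j), keyed by (i, j), following Eq. 91."""
    gamma = c_prime * h + (1 - c_prime)
    gamma_p = c_prime + (1 - c_prime) * h
    delta = (c / 2) * h + (1 - c / 2)
    delta_p = c / 2 + (1 - c / 2) * h
    x = h + (1 - h) * p
    y = h + (1 - h) * (1 - p)
    n = n_elements
    carry = (n - 1) * x / n          # a different element is sampled next
    return {
        (1, 1): ((h + (1 - h) * delta) * p + (1 - p) * h * gamma_p) / (n * x) + carry,
        (2, 1): delta_p / n + carry,
        (1, 2): delta / n + carry,
        (2, 2): ((1 - h) * delta_p * (1 - p) + p * h * gamma
                 + h * h * (1 - p)) / (n * y) + carry,
        (1, 0): delta / n + carry,
        (2, 0): delta_p / n + carry,
    }


# Group II schedule with the parameter estimates reported in the text.
xi1, xi2, xi0 = 0.2, 0.6, 0.2
N, c, c_prime, h = 4.23, 0.357, 1.0, 0.289
p = p_inf(xi1, xi0, h, c_prime / c)
seq = sequential_predictions(N, c, c_prime, h, p)
gap = seq[(1, 0)] - seq[(2, 0)]      # Eq. 91e minus Eq. 91f
```

The difference `gap` equals $(1-h)(1-c)/N$, so it is nonnegative for every admissible parameter set, as the text asserts.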


Insert Table 8 about here 





In Table 8 the observed values for $\Pr(A_i \mid T_j A_k T_l)$ are presented
as reported by Kinchla. Estimates of these conditional probabilities were
computed for individual subjects using the data over the last 400 trials;
the averages of these individual estimates are the quantities given in
the table. Each entry is based on 24 subjects.

In order to generate theoretical predictions for the observed
entries in Table 8, values for $N$, $c$, $c'$ and $h$ are needed. Of course,
estimates of $h$ and $\psi = c'/c$ already have been made for this set of
data, and therefore it is only necessary to estimate $N$ and either $c$
or $c'$. We obtain our estimates of $N$ and $c$ by a least squares

Table 8

Predicted and Observed Asymptotic Sequential Response
Probabilities in Visual Detection Experiment





| rat | aoe cd 





Observed Predicted Predicted 





eee 57 58 .59 64 

Pr(Ay|T>A5T, ) 65 69 .70 +76 
Pr(A,|T>A,To ) “Th -7L 79 TT 
Pr(A,| TA, T>) 661 59 .69 66 
Pr(A5|T3A,To) 4 259 68 .66 
Pr(A,|TA,T,) -66 -70 | mf 76 
Pr(A _!t\A {ni ‘DB “TL 70 65 
Pr(A,|T,4,7,) 62 59 259 52 
Pr(A, \r Ants ) “53 -58 “5D » 51 
Pr(A a is) 66 70 64 6} 
Pr(A ,|t)A ito) 72 -70 261 63 
Pr(A, TA 27) 61 +59 48 52 
Pr(A, \z oft) 38 «40 47 49 
Px(Ag|T Apt, ) 56 58 59 .66— 
Pr(A,|T pT) 64 .60 67 .68 
Pr(4,|Tg4, 2) 47 42 51 BE 
Pr(A,|T QA, To) 47 42 50 +51 





Pr(A3|TyAoTo) -60 58 65 66
method; i.e., we select a value of $N$ and $c$ (where $c' = c\psi$) so that
the sum of squared deviations between the 36 observed values in Table 8
and the corresponding theoretical quantities is minimized. The
theoretical quantities for $\Pr(A_1 \mid T_1 A_i T_j)$ are computed from
Eq. 91; theoretical expressions for $\Pr(A_2 \mid T_2 A_i T_j)$ and
$\Pr(A_1 \mid T_0 A_i T_j)$ have not been presented here but are of the
same general form as those given in Eq. 91.

Using this technique, estimates of the parameters are as follows:

$$N = 4.23 \qquad c' = 1.00 \qquad h = .289 \qquad c = .357 \tag{92}$$

The predictions corresponding to these parameter values are presented
in Table 8. When one considers that only four of the possible 36 degrees
of freedom represented in Table 8 have been utilized in estimating
parameters, the close correspondence between theoretical and observed
quantities may be interpreted as giving considerable support to the
assumptions of the model.

A great deal of research needs to be done to explore the consequences
of this approach to signal detection. In terms of the experimental
problem considered in this section much progress can be made via
differential tests among alternative formulations of the model. For
example, we postulated a multi-element pattern model to describe the
learning process associated with background stimuli; it would be
important to determine whether other formulations of the learning process
such as those developed in Sec. 5 or those proposed by Bush and Mosteller
(1955) would provide as good or even better theoretical fits than the
ones displayed in Tables 7 and 8. Also, it would be valuable to examine
variations in the scheme for sampling sensory elements along lines
developed by Luce (1959) and Restle (1961).

More generally, further development of the theory is required
before one could attempt to deal with the wide range of empirical
phenomena encompassed in the approach to perception via decision theory
proposed by Swets, Tanner, and Birdsall (1961) and others. Some
theoretical work has been done by Atkinson (1961b) along the lines
outlined in this section to account for the ROC
(receiver-operating-characteristic) curves that are typically observed in
detection studies and to specify the relation between forced-choice and
yes-no experiments. However, this work is still quite tentative and an
evaluation of the approach will require extensive analyses of the
detailed sequential properties of psychophysical data.





6.5 Multiple Process Models

Analyses of certain behavioral situations have proved to require
formulations in terms of two or more distinguishable, though possibly
interdependent, learning processes that proceed simultaneously. For
some situations these separate processes may be directly observable;
for other situations we may find it advantageous to postulate processes
that are unobservable but which determine in some well-defined fashion
the sequence of observable behaviors.

For example, in Restle's (1955) treatment of discrimination
learning it is assumed that irrelevant stimuli may become "adapted"
over a period of time and thus be rendered nonfunctional. Such an
analysis entails a two-process system. One process has to do with
the conditioning of stimuli to responses, whereas the other process
prescribes both the conditions under which cues become irrelevant and
the rate at which adaptation occurs.

Another application of multiple process models arises with regard 
to discrimination problems in which either a covert or a directly ob- 
servable orienting response is required. One process might describe 
how the stimuli presented to the subject become conditioned to discrim- 

inative responses. Another process might specify the acquisition and 
extinction of various orienting responses; these orienting responses 
would determine the specific subset of the environment that the subject 
would perceive on a given trial. For models dealing with this type of 
problem see Atkinson (1958), Bower (1959), and Wyckoff (1952). 

As another example, consider a two-process scheme developed by
Atkinson (1960) to account for certain types of discrimination behavior.
This model makes use of the distinction, developed in Secs. 3 and 4 of
the present chapter, between component models and pattern models and
suggests that the subject may (at any instant in time) perceive the
stimulus situation either as a unit pattern or as a collection of
individual components. Thus, two perceptual states are defined; one
in which the subject responds to the pattern of stimulation and one in
which he responds to the separate components of the situation. Two
learning processes are also defined. One process specifies how the
patterns and components become conditioned to responses, and the second
process describes the conditions under which the subject shifts from
one perceptual state to another. The control of the second process is
governed by the reinforcing schedule, the subject's sequence of responses,
and by similarity of the discriminanda. In this model neither the
conditioning states nor the perceptual states are observable;
nevertheless, the behavior of the subject is rigorously defined in terms
of these hypothetical states.

Models of the sort described above are generally difficult to work
with mathematically and consequently have had only limited development
and analysis. It is for this reason that we select a particularly
simple example to illustrate the type of formulation that is possible.
The example deals with a discrimination learning task investigated by
Atkinson (1961a) in which observing responses are categorized and
directly measured.

The experimental situation consists of a sequence of discrete
trials. Each trial is specified in terms of the following classifications:

$T_1$, $T_2$: Trial type. Each trial is either a $T_1$ or a $T_2$. The
      trial type is set by the experimenter and determines in
      part the stimulus event occurring on the trial.

$R_1$, $R_2$: Observing responses. On each trial, the subject makes
      either an $R_1$ or $R_2$. The particular observing response
      determines in part the stimulus event for that trial.

$s_1$, $s_2$, $s_b$: Stimulus events. Following the observing response, one
      and only one of these stimulus events (discriminative cues)
      occurs. On a $T_1$ trial either $s_1$ or $s_b$ can occur; on
      a $T_2$ trial either $s_2$ or $s_b$ can occur.

      15 The subscript b has been used to denote the stimulus event that
      may occur on both $T_1$ and $T_2$ trials; the subscripts 1 and 2 denote
      stimulus events unique to $T_1$ and $T_2$ trials, respectively.

$A_1$, $A_2$: Discriminative responses. On each trial the subject makes
      either an $A_1$ or $A_2$ response to the presentation of a
      stimulus event.

$O_1$, $O_2$: Trial outcome. Each trial is terminated with the occurrence
      of one of these events. An $O_1$ indicates that $A_1$ was
      the correct response for that trial, and $O_2$ indicates
      that $A_2$ was correct.

The sequence of events on a trial is as follows: (1) The ready
signal occurs and the subject responds with $R_1$ or $R_2$. (2) Following
the observing response $s_1$, $s_2$ or $s_b$ is presented. (3) To the onset
of the stimulus event the subject responds with either $A_1$ or $A_2$. (4)
The trial terminates with either an $O_1$ or $O_2$ event.

To keep the analysis simple we consider an experimenter-controlled
reinforcement schedule. On a $T_1$ trial, either an $O_1$ occurs with
probability $\pi_1$ or an $O_2$ with probability $1-\pi_1$; on a $T_2$ trial
an $O_1$ occurs with probability $\pi_2$ or an $O_2$ with probability
$1-\pi_2$. The $T_1$ type trial occurs with probability $\beta$ and $T_2$
with probability $1-\beta$. Thus a $T_1$-$O_1$ combination occurs with
probability $\beta\pi_1$; a $T_1$-$O_2$ with probability $\beta(1-\pi_1)$;
and so on.

The particular stimulus event $s_i$ ($i = 1, 2, b$) that the
experimenter presents on any trial depends on the trial type ($T_1$ or
$T_2$) and the subject's observing response ($R_1$ or $R_2$). Specifically:

(i) If an $R_1$ is made then
    (a) with probability $\alpha$ the $s_1$ event occurs on a
        $T_1$ trial and the $s_2$ event on a $T_2$ trial;
    (b) with probability $1-\alpha$ the $s_b$ event occurs,
        regardless of the trial type.

(ii) If an $R_2$ is made then
    (a) with probability $\alpha$ the $s_b$ event occurs,
        regardless of the trial type;
    (b) with probability $1-\alpha$ the $s_1$ event occurs on
        a $T_1$ trial and $s_2$ on a $T_2$ trial.
To clarify this procedure, consider the case where $\alpha = 1$,
$\pi_1 = 1$, and $\pi_2 = 0$. If the subject is to be correct on every
trial, he must make an $A_1$ on a $T_1$ type trial and an $A_2$ on a $T_2$
type trial. However, the subject can only ascertain the trial type by
making the appropriate observing response. That is, $R_1$ must be made in
order to identify the trial type, for the occurrence of $R_2$ always leads
to the presentation of $s_b$ regardless of the trial type. Hence, for
perfect responding the subject must make $R_1$ with probability 1 and then
make $A_1$ to $s_1$ or $A_2$ to $s_2$. The purpose of the Atkinson study
was to determine how variations in $\pi_1$, $\pi_2$ and $\alpha$ would
affect both the observing responses and the discriminative responses.

Our analysis of this experimental procedure will be based on the
axioms presented in Secs. 2 and 3. However, in order to apply the theory
we must first identify the stimulus and reinforcing events in terms of
the experimental operations. The identification we offer seems quite
natural to us and is in accord with the formulations given in Secs. 2
and 3.

We assume that associated with the ready signal is a set $S_R$ of
pattern elements. Each element in $S_R$ is conditioned to either the
$R_1$ or the $R_2$ observing response; there are $N'$ such elements. At
the start of each trial (i.e., with the onset of the ready signal) an
element is sampled from $S_R$ and the subject makes the response to which
the element is conditioned.

Associated with each stimulus event $s_i$ ($i = 1, 2, b$) is a set $S_i$
of pattern elements; elements in $S_i$ are conditioned to either the $A_1$
or the $A_2$ discrimination response. There are $N$ such elements in each
set $S_i$ and, for simplicity, we assume the sets are pairwise disjoint.
When the stimulus event $s_i$ occurs one element is randomly sampled from
$S_i$ and the subject makes the discriminative response to which the
element is conditioned.

Thus, we have two types of learning processes; one defined on the
set $S_R$ and the other defined on the sets $S_1$, $S_b$ and $S_2$. Once
the reinforcing events have been specified for these processes we can
apply our axioms. The interpretation of reinforcement for the
discrimination response process is identical to that given in Sec. 3. If
a pattern element is sampled from set $S_i$ for $i = 1, 2, b$ and followed
by an $O_j$ ($j = 1, 2$) outcome, then with probability $c$ the element
becomes conditioned to $A_j$ and with probability $1-c$ the conditioning
state of the sampled element remains unchanged.


The conditioning process for the $S_R$ set is somewhat more complex
in that the reinforcing events for the observing responses are assumed
to be subject-controlled. Specifically, if an element conditioned to
$R_i$ is sampled from $S_R$ and followed by either an $A_1O_1$ or $A_2O_2$
event, then the element will remain conditioned to $R_i$; however, if
$A_1O_2$ or $A_2O_1$ occurs, then with probability $c'$ the element will
become conditioned to the other observing response. Otherwise stated,
if an element from $S_R$ elicits an observing response that selects a
stimulus event and, in turn, the stimulus event elicits a correct
discrimination response (i.e., $A_1O_1$ or $A_2O_2$) then the sampled
element will remain conditioned to that observing response. However, if
the observing response selects a stimulus event that gives rise to an
incorrect discrimination response (i.e., $A_1O_2$ or $A_2O_1$), then there
will be a decrement in the tendency to repeat that observing response
on the next trial.

Given the above identification of events we can now generate a
mathematical model for the experiment. To simplify the analysis we let
$N' = N = 1$; namely, we assume that there is one element in each of our
stimulus sets and consequently the single element is sampled with
probability 1 whenever the set is available. With this restriction we may
describe the conditioning state of a subject, at the start of each trial,
by an ordered four-tuple $\langle ijkl \rangle$ where

(1) the first member $i$ is 1 or 2 and indicates whether the
    single element of $S_R$ is conditioned to $R_1$ or $R_2$;

(2) the second member $j$ is 1 or 2 and indicates whether the
    single element of $S_1$ is conditioned to $A_1$ or $A_2$;

(3) the third member $k$ is 1 or 2 and indicates whether the
    element of $S_b$ is conditioned to $A_1$ or $A_2$;

(4) the fourth member $l$ is 1 or 2 and indicates whether the
    element of $S_2$ is conditioned to $A_1$ or $A_2$.

Thus, if the subject is in state $\langle ijkl \rangle$ he will make the
$R_i$ observing response; then, to $s_1$, $s_b$ or $s_2$ he will make
discriminative response $A_j$, $A_k$ or $A_l$, respectively.

From our assumptions it follows that the sequence of random variables
that take the subject states $\langle ijkl \rangle$ as values is a 16-state
Markov chain. Figure 10 displays the possible transitions that can occur
when the subject is in state $\langle 1122 \rangle$ on trial $n$.

Insert Figure 10 about here

To clarify this tree, let us trace out the top branch. An $R_1$ is
elicited with probability 1 and with probability $\beta\pi_1$ a $T_1$ trial
with an $O_1$ outcome occurs; further, given an $R_1$ response on a $T_1$
trial there is probability $\alpha$ that the $s_1$ stimulus event occurs;
the onset of the $s_1$ event elicits a correct response and hence no
change occurs in the conditioning state of any of the stimulus patterns.
Now consider the next set of branches: an $R_1$ occurs and we have a
$T_1O_1$ trial; with probability $1-\alpha$ the $s_b$ stimulus is presented
and an $A_2$ occurs; the $A_2$ response is incorrect (in that it is
followed by an $O_1$ event), hence with probability $c$ the element of set
$S_b$ becomes conditioned to $A_1$ and with independent probability $c'$
the element of set $S_R$ becomes conditioned to the alternative observing
response, namely $R_2$.

From this tree we obtain probabilities corresponding to the
$\langle 1122 \rangle$ row in the transition matrix. For example, the
probability of going from $\langle 1122 \rangle$ to $\langle 2112 \rangle$
is simply $\beta\pi_1(1-\alpha)cc' + (1-\beta)\pi_2(1-\alpha)cc'$;
that is, the sum over branches 2 and 15. An inspection of the transition
matrix yields some important results. For example, if $\alpha = 1$,
$\pi_1 = 1$ and $\pi_2 = 0$ then states $\langle 1112 \rangle$ and
$\langle 1122 \rangle$ are absorbing and hence in the limit
$\Pr(R_{1,n}) = 1$, $\Pr(A_{1,n} \mid T_{1,n}) = 1$, and
$\Pr(A_{1,n} \mid T_{2,n}) = 0$.

Fig. 10. Branching process, starting in state <1122>, for a single
trial in the two-process discrimination learning model.


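As before, let $u_{ijkl}^{(n)}$ denote the probability of being in state
$\langle ijkl \rangle$ on trial $n$; when the limit exists let
$u_{ijkl} = \lim_{n \to \infty} u_{ijkl}^{(n)}$.
Experimentally, we shall be interested in evaluating the following
theoretical predictions:

$$\Pr(R_{1,n}) = u_{1111}^{(n)} + u_{1112}^{(n)} + u_{1121}^{(n)} + u_{1122}^{(n)}
  + u_{1211}^{(n)} + u_{1212}^{(n)} + u_{1221}^{(n)} + u_{1222}^{(n)} \tag{93a}$$

$$\Pr(A_{1,n} \mid T_{1,n}) = u_{1111}^{(n)} + u_{1112}^{(n)} + u_{2111}^{(n)} + u_{2112}^{(n)}
  + \alpha\,[u_{1121}^{(n)} + u_{1122}^{(n)} + u_{2211}^{(n)} + u_{2212}^{(n)}]
  + (1-\alpha)\,[u_{1211}^{(n)} + u_{1212}^{(n)} + u_{2121}^{(n)} + u_{2122}^{(n)}] \tag{93b}$$

$$\Pr(A_{1,n} \mid T_{2,n}) = u_{1111}^{(n)} + u_{1211}^{(n)} + u_{2111}^{(n)} + u_{2211}^{(n)}
  + \alpha\,[u_{1121}^{(n)} + u_{1221}^{(n)} + u_{2112}^{(n)} + u_{2212}^{(n)}]
  + (1-\alpha)\,[u_{1112}^{(n)} + u_{1212}^{(n)} + u_{2121}^{(n)} + u_{2221}^{(n)}] \tag{93c}$$

$$\Pr(R_{1,n} \cap A_{1,n}) = u_{1111}^{(n)} + \alpha\,u_{1121}^{(n)} + (1-\alpha)\,u_{1212}^{(n)}
  + [\alpha\beta + (1-\alpha)]\,u_{1112}^{(n)} + \alpha\beta\,u_{1122}^{(n)}
  + [\alpha(1-\beta) + (1-\alpha)]\,u_{1211}^{(n)} + \alpha(1-\beta)\,u_{1221}^{(n)} \tag{93d}$$

$$\Pr(R_{2,n} \cap A_{1,n}) = u_{2111}^{(n)} + \alpha\,u_{2212}^{(n)} + (1-\alpha)\,u_{2121}^{(n)}
  + [\alpha + \beta(1-\alpha)]\,u_{2112}^{(n)} + \beta(1-\alpha)\,u_{2122}^{(n)}
  + [\alpha + (1-\beta)(1-\alpha)]\,u_{2211}^{(n)} + (1-\beta)(1-\alpha)\,u_{2221}^{(n)} \tag{93e}$$

The first equation gives the probability of an $R_1$ response. The
second and third equations give the probability of an $A_1$ response on
$T_1$ and $T_2$ trials, respectively. Finally, the last two equations
present the probability of the joint occurrence of each observing
response with an $A_1$ response.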


1 


In the experiment reported by Atkinson (1961a) six groups were run
with 40 subjects in each group. For all groups $\pi_1 = .9$ and
$\beta = .5$. The groups differed with respect to the value of $\alpha$
and $\pi_2$. For Groups I-III, $\alpha = 1$; and for Groups IV-VI,
$\alpha = .75$. For Groups I and IV, $\pi_2 = .9$; for II and V,
$\pi_2 = .5$; and for Groups III and VI, $\pi_2 = .1$. The design can be
described by the following array:

                        $\pi_2$
                  .9      .5      .1
   $\alpha$  1.0   I       II      III
             .75   IV      V       VI

Given these values of $\pi_1$, $\pi_2$, $\alpha$ and $\beta$ our 16-state
Markov chain is irreducible and aperiodic. Thus,
$\lim_{n \to \infty} u_{ijkl}^{(n)} = u_{ijkl}$ exists
and can be obtained by solving the appropriate set of 16 linear equations
(see Eq. 16). The values predicted by the model are given in Table 9
for the case where $c = c'$. Values for the $u_{ijkl}$'s were computed
Insert Table 9 about here 


and then combined by Eq. 93 to predict the response probabilities. By
presenting a single value for each theoretical quantity in the table we
imply that these predictions are independent of $c$ and $c'$. Actually
this is not always the case. However, for the schedules employed in
this experiment the dependency of these asymptotic predictions on $c$ and
$c'$ is virtually negligible. For $c = c'$ ranging over the interval
from .0001 to 1.0 the predicted values given in Table 9 are affected
in only the third or fourth decimal place; it is for this reason that
we present theoretical values to only two decimal places.
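The computation described here can be sketched directly: enumerate the 16 conditioning states, build the transition matrix from the trial rules, find the stationary distribution, and combine it into the response probabilities of Eq. 93. The code below is an illustrative reconstruction under stated assumptions: it uses simple power iteration instead of solving the 16 linear equations, and the function names, the iteration count, and the choice $c = c' = .5$ are ours.

```python
from itertools import product

# States (i, j, k, l): conditioning of the S_R, S_1, S_b, S_2 elements.
STATES = list(product((1, 2), repeat=4))
INDEX = {state: n for n, state in enumerate(STATES)}


def transition_matrix(alpha, beta, pi1, pi2, c, c_prime):
    """One-trial transition probabilities for the one-element model."""
    P = [[0.0] * len(STATES) for _ in STATES]
    for (i, j, k, l) in STATES:
        row = P[INDEX[(i, j, k, l)]]
        # R1 yields the trial-specific cue w.p. alpha, R2 w.p. 1 - alpha.
        p_unique = alpha if i == 1 else 1 - alpha
        for trial, p_trial, pi in ((1, beta, pi1), (2, 1 - beta, pi2)):
            for outcome, p_out in ((1, pi), (2, 1 - pi)):
                for cue, p_cue in ((trial, p_unique), ("b", 1 - p_unique)):
                    p_branch = p_trial * p_out * p_cue
                    if p_branch == 0.0:
                        continue
                    response = {1: j, "b": k, 2: l}[cue]
                    if response == outcome:      # correct: nothing changes
                        row[INDEX[(i, j, k, l)]] += p_branch
                        continue
                    # Incorrect: the cue element is reconditioned w.p. c and,
                    # independently, the observing element switches w.p. c'.
                    for learn, p_learn in ((True, c), (False, 1 - c)):
                        for flip, p_flip in ((True, c_prime), (False, 1 - c_prime)):
                            ni = 3 - i if flip else i
                            nj, nk, nl = j, k, l
                            if learn and cue == 1:
                                nj = outcome
                            elif learn and cue == 2:
                                nl = outcome
                            elif learn:
                                nk = outcome
                            row[INDEX[(ni, nj, nk, nl)]] += p_branch * p_learn * p_flip
    return P


def step(u, P):
    """Advance the state distribution by one trial."""
    size = len(P)
    return [sum(u[r] * P[r][s] for r in range(size)) for s in range(size)]


def stationary(P, iterations=5000):
    """Approximate the limiting state probabilities by power iteration."""
    u = [1.0 / len(P)] * len(P)
    for _ in range(iterations):
        u = step(u, P)
    return u


def predictions(u, alpha, beta):
    """Combine state probabilities into the five quantities of Eq. 93."""
    out = {"R1": 0.0, "A1|T1": 0.0, "A1|T2": 0.0, "R1&A1": 0.0, "R2&A1": 0.0}
    for (i, j, k, l), prob in zip(STATES, u):
        p_unique = alpha if i == 1 else 1 - alpha
        a1_t1 = p_unique * (j == 1) + (1 - p_unique) * (k == 1)
        a1_t2 = p_unique * (l == 1) + (1 - p_unique) * (k == 1)
        out["R1"] += prob * (i == 1)
        out["A1|T1"] += prob * a1_t1
        out["A1|T2"] += prob * a1_t2
        out["R1&A1" if i == 1 else "R2&A1"] += prob * (beta * a1_t1 + (1 - beta) * a1_t2)
    return out


# Group V schedule: alpha = .75, pi_1 = .9, pi_2 = .5, beta = .5.
P = transition_matrix(0.75, 0.5, 0.9, 0.5, c=0.5, c_prime=0.5)
u = stationary(P)
group_five = predictions(u, alpha=0.75, beta=0.5)
```

Rerunning the last three lines with other values of $c = c'$ is a direct way to examine how weakly the asymptotic predictions depend on the conditioning parameters.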

In view of these comments it should be clear that the predictions
in Table 9 are based solely on the experimental parameter values.
Consequently, differences between subjects (that may be represented by
intersubject variability in $c$ and $c'$) do not substantially affect
these predictions.

Table 9

Predicted and Observed Asymptotic Response Probabilities
in Observing Response Experiment





















Group I 


90 | .94 J .okuk ET . 


Group. ET 


. : 28794 .19 fe 


Group IIT 





Pr(A,12;) 


Pr(A, 25} 
Pr(R,) - 285 


/Pr(RyMA,) 


Pr(A\|7)) 


Pr(A, |p) en 
Pr(Ry) -263 
Pr(R, (\A,) 138 


Pr(Ry ) A) 
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In the Atkinson study 400 trials were run and the response propor- 
tions appear to have reached a fairly stable level over the last half 
of the experiment. Consequently, the proportions computed over the 
final block of 160 trials were used as estimates of asymptotic quantities. 
Table 9 presents the mean and standard deviation of the 40 observed
proportions obtained under each experimental condition. As can be seen,
the agreement between theoretical and observed quantities is fairly good.
Despite the fact that these gross asymptotic predictions hold up
quite well, it is obvious that some of the predictions from the model
will not be confirmed. The difficulty with the one-element assumption
is that the fundamental theory laid down by the axioms of Sec. 3 is
completely deterministic in many respects. For example, when $N' = 1$
we have

$$\Pr(R_{1,n+1} \mid O_{1,n} A_{1,n} R_{1,n}) = 1;$$

namely, if an $R_1$ occurs on trial $n$ and is reinforced (i.e., followed
by an $A_1O_1$ event) then $R_1$ will reoccur with probability 1 on trial
$n+1$. This prediction, of course, is a consequence of the assumption
that we have but one element in set $S_R$ which necessarily is sampled
on every trial. If we assume more than one element, the deterministic
features of the model no longer hold and such sequential statistics
become functions of $c$, $c'$, $N$ and $N'$. Unfortunately, for elaborate
experimental procedures of the sort described in this section, the
multi-element case leads to complicated mathematical processes for which
it is extremely difficult to carry out computations. Thus, the generality
of the multi-element assumption may often be offset by the difficulty
involved in making predictions.

Naturally it is usually preferable to choose from the available
models the one that best fits the data, but in the present state of
psychological knowledge no single model is clearly superior to all others
in every facet of analysis. The one-element assumption, despite some of
its erroneous features, may prove to be a valuable instrument for the
rapid exploration of a wide variety of complex phenomena. For most of
the cases we have examined, the predicted mean response probabilities
are usually independent of (or only slightly dependent on) the number of
elements assumed. Thus the one-element assumption may be viewed as a
simple device for computing the grosser predictions of the general theory.

For exploratory work in complex situations, then, we recommend using
the one-element model because of the greater difficulty of computations
for the multi-element models. In advocating this approach we are taking
a methodological position with which some scientists do not agree. Our
position is in contrast to one which asserts that a model should be
discarded once it is clear that certain of its predictions are in error.
We do not take it to be the principal goal (or even, in many cases, an
important goal) of theory construction to provide models for particular
experimental situations. The assumptions of stimulus sampling theory
are intended to describe processes or relationships that are common to a
wide variety of learning situations, but with no implication that behavior
in these situations is a function solely of the variables represented in
the theory. As we have attempted to illustrate by means of numerous
examples, formulation of a model within this framework for a particular
experiment is a matter of selecting the relevant assumptions, or axioms,
of the general theory and interpreting these in terms of the conditions
of the experiment. How much of the variance in a set of data can be
accounted for by a model depends jointly on the adequacy of the
theoretical assumptions and on the extent to which it has been possible to
realize experimentally the boundary conditions envisaged in the theory,
thereby minimizing the effects of variables not represented. In our
view, a model, in application to a given experiment, is not to be
classified as "correct" or "incorrect"; rather, the degree to which it
accounts for the data may provide evidence tending either to support or
to cast doubt on the theory from which the particular model was derived.
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